stb_image_resize.h (115999B)

      1 /* stb_image_resize - v0.95 - public domain image resizing
      2    by Jorge L Rodriguez (@VinoBS) - 2014
      3    http://github.com/nothings/stb
      4 
      5    Written with emphasis on usability, portability, and efficiency. (No
      6    SIMD or threads, so it can be easily outperformed by libs that use those.)
      7    Only scaling and translation are supported, no rotations or shears.
      8    Easy API downsamples w/Mitchell filter, upsamples w/cubic interpolation.
      9 
     10    COMPILING & LINKING
     11       In one C/C++ file that #includes this file, do this:
     12          #define STB_IMAGE_RESIZE_IMPLEMENTATION
     13       before the #include. That will create the implementation in that file.
     14 
     15    QUICKSTART
     16       stbir_resize_uint8(      input_pixels , in_w , in_h , 0,
     17                                output_pixels, out_w, out_h, 0, num_channels)
     18       stbir_resize_float(...)
     19       stbir_resize_uint8_srgb( input_pixels , in_w , in_h , 0,
     20                                output_pixels, out_w, out_h, 0,
     21                                num_channels , alpha_chan  , 0)
     22       stbir_resize_uint8_srgb_edgemode(
     23                                input_pixels , in_w , in_h , 0, 
     24                                output_pixels, out_w, out_h, 0, 
     25                                num_channels , alpha_chan  , 0, STBIR_EDGE_CLAMP)
     26                                                             // WRAP/REFLECT/ZERO
     27 
     28    FULL API
     29       See the "header file" section of the source for API documentation.
     30 
     31    ADDITIONAL DOCUMENTATION
     32 
     33       SRGB & FLOATING POINT REPRESENTATION
     34          The sRGB functions presume IEEE floating point. If you do not have
     35          IEEE floating point, define STBIR_NON_IEEE_FLOAT. This will use
     36          a slower implementation.
     37 
     38       MEMORY ALLOCATION
     39          The resize functions here perform a single memory allocation using
     40          malloc. To control the memory allocation, before the #include that
     41          triggers the implementation, do:
     42 
     43             #define STBIR_MALLOC(size,context) ...
     44             #define STBIR_FREE(ptr,context)   ...
     45 
     46          Each resize function makes exactly one call to malloc/free, so to use
     47          temp memory, store the temp memory in the context and return that.
     48 
     49       ASSERT
     50          Define STBIR_ASSERT(boolval) to override assert() and not use assert.h
     51 
     52       OPTIMIZATION
     53          Define STBIR_SATURATE_INT to clamp values to the valid range using
     54          integer operations instead of float operations. This may be faster
     55          on some platforms.
     56 
     57       DEFAULT FILTERS
     58          For functions which don't provide explicit control over what filters
     59          to use, you can change the compile-time defaults with
     60 
     61             #define STBIR_DEFAULT_FILTER_UPSAMPLE     STBIR_FILTER_something
     62             #define STBIR_DEFAULT_FILTER_DOWNSAMPLE   STBIR_FILTER_something
     63 
     64          See stbir_filter in the header-file section for the list of filters.
     65 
     66       NEW FILTERS
     67          A number of 1D filter kernels are used. For a list of
     68          supported filters see the stbir_filter enum. To add a new filter,
     69          write a filter function and add it to stbir__filter_info_table.
     70 
     71       PROGRESS
     72          For interactive use with slow resize operations, you can install
     73          a progress-report callback:
     74 
     75             #define STBIR_PROGRESS_REPORT(val)   some_func(val)
     76 
     77          The parameter val is a float which goes from 0 to 1 as progress is made.
     78 
     79          For example:
     80 
     81             static void my_progress_report(float progress);
     82             #define STBIR_PROGRESS_REPORT(val) my_progress_report(val)
     83 
     84             #define STB_IMAGE_RESIZE_IMPLEMENTATION
     85             #include "stb_image_resize.h"
     86 
     87             static void my_progress_report(float progress)
     88             {
     89                printf("Progress: %f%%\n", progress*100);
     90             }
     91 
     92       MAX CHANNELS
     93          If your image has more than 64 channels, define STBIR_MAX_CHANNELS
     94          to the max you'll have.
     95 
     96       ALPHA CHANNEL
     97          Most of the resizing functions provide the ability to control how
     98          the alpha channel of an image is processed. The important things
     99          to know about this:
    100 
    101          1. The best mathematically-behaved version of alpha to use is
    102          called "premultiplied alpha", in which the other color channels
    103          have had the alpha value multiplied in. If you use premultiplied
    104          alpha, linear filtering (such as image resampling done by this
    105          library, or performed in texture units on GPUs) does the "right
    106          thing". While premultiplied alpha is standard in the movie CGI
    107          industry, it is still uncommon in the videogame/real-time world.
    108 
    109          If you linearly filter non-premultiplied alpha, strange effects
    110          occur. (For example, the 50/50 average of 99% transparent bright green
    111          and 1% transparent black produces 50% transparent dark green when
    112          non-premultiplied, whereas premultiplied it produces 50%
    113          transparent near-black. The former introduces green energy
    114          that doesn't exist in the source image.)
    115 
    116          2. Artists should not edit premultiplied-alpha images; artists
    117          want non-premultiplied alpha images. Thus, art tools generally output
    118          non-premultiplied alpha images.
    119 
    120          3. You will get best results in most cases by converting images
    121          to premultiplied alpha before processing them mathematically.
    122 
    123          4. If you pass the flag STBIR_FLAG_ALPHA_PREMULTIPLIED, the
    124          resizer does not do anything special for the alpha channel;
    125          it is resampled identically to other channels. This produces
    126          the correct results for premultiplied-alpha images, but produces
    127          less-than-ideal results for non-premultiplied-alpha images.
    128 
    129          5. If you do not pass the flag STBIR_FLAG_ALPHA_PREMULTIPLIED,
    130          then the resizer weights the contribution of input pixels
    131          based on their alpha values, or, equivalently, it multiplies
    132          the alpha value into the color channels, resamples, then divides
    133          by the resultant alpha value. Input pixels which have alpha=0 do
    134          not contribute at all to output pixels unless _all_ of the input
    135          pixels affecting that output pixel have alpha=0, in which case
    136          the result for that pixel is the same as it would be without
    137          STBIR_FLAG_ALPHA_PREMULTIPLIED. However, this is only true for
    138          input images in integer formats. For input images in float format,
    139          input pixels with alpha=0 have no effect, and output pixels
    140          which have alpha=0 will be 0 in all channels. (For float images,
    141          you can manually achieve the same result by adding a tiny epsilon
    142          value to the alpha channel of every image, and then subtracting
    143          or clamping it at the end.)
    144 
    145          6. You can suppress the behavior described in #5 and make
    146          all-0-alpha pixels have 0 in all channels by #defining
    147          STBIR_NO_ALPHA_EPSILON.
    148 
    149          7. You can separately control whether the alpha channel is
    150          interpreted as linear or affected by the colorspace. By default
    151          it is linear; you almost never want to apply the colorspace.
    152          (For example, graphics hardware does not apply sRGB conversion
    153          to the alpha channel.)
    154 
    155    CONTRIBUTORS
    156       Jorge L Rodriguez: Implementation
    157       Sean Barrett: API design, optimizations
    158       Aras Pranckevicius: bugfix
    159       Nathan Reed: warning fixes
    160 
    161    REVISIONS
    162       0.95 (2017-07-23) fixed warnings
    163       0.94 (2017-03-18) fixed warnings
    164       0.93 (2017-03-03) fixed bug with certain combinations of heights
    165       0.92 (2017-01-02) fix integer overflow on large (>2GB) images
    166       0.91 (2016-04-02) fix warnings; fix handling of subpixel regions
    167       0.90 (2014-09-17) first released version
    168 
    169    LICENSE
    170      See end of file for license information.
    171 
    172    TODO
    173       Don't decode all of the image data when only processing a partial tile
    174       Don't use full-width decode buffers when only processing a partial tile
    175       When processing wide images, break processing into tiles so data fits in L1 cache
    176       Installable filters?
    177       Resize that respects alpha test coverage
    178          (Reference code: FloatImage::alphaTestCoverage and FloatImage::scaleAlphaToCoverage:
    179          https://code.google.com/p/nvidia-texture-tools/source/browse/trunk/src/nvimage/FloatImage.cpp )
    180 */
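The green/black averaging example from the ALPHA CHANNEL notes above can be checked numerically. The sketch below is not part of the library: `rgba_f`, `average_straight`, and `average_alpha_weighted` are hypothetical names, and it assumes straight (non-premultiplied) float pixels with channels in [0, 1].

```c
/* Hypothetical helper type: one straight-alpha (non-premultiplied) pixel. */
typedef struct { float r, g, b, a; } rgba_f;

/* Naive 50/50 average applied to every channel independently. */
static rgba_f average_straight(rgba_f p, rgba_f q)
{
    rgba_f out;
    out.r = (p.r + q.r) * 0.5f;
    out.g = (p.g + q.g) * 0.5f;
    out.b = (p.b + q.b) * 0.5f;
    out.a = (p.a + q.a) * 0.5f;
    return out;
}

/* Premultiply by alpha, average, then divide by the averaged alpha. */
static rgba_f average_alpha_weighted(rgba_f p, rgba_f q)
{
    rgba_f out;
    out.a = (p.a + q.a) * 0.5f;
    out.r = (p.r * p.a + q.r * q.a) * 0.5f;
    out.g = (p.g * p.a + q.g * q.a) * 0.5f;
    out.b = (p.b * p.a + q.b * q.a) * 0.5f;
    if (out.a > 0.0f) {
        /* Convert back to straight alpha for comparison. */
        out.r /= out.a;
        out.g /= out.a;
        out.b /= out.a;
    }
    return out;
}
```

With bright green at alpha 0.01 and black at alpha 0.99, the straight average keeps green at 0.5, while the alpha-weighted average yields green of roughly 0.01; both end up at alpha 0.5.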
    181 
    182 #ifndef STBIR_INCLUDE_STB_IMAGE_RESIZE_H
    183 #define STBIR_INCLUDE_STB_IMAGE_RESIZE_H
    184 
    185 #ifdef _MSC_VER
    186 typedef unsigned char  stbir_uint8;
    187 typedef unsigned short stbir_uint16;
    188 typedef unsigned int   stbir_uint32;
    189 #else
    190 #include <stdint.h>
    191 typedef uint8_t  stbir_uint8;
    192 typedef uint16_t stbir_uint16;
    193 typedef uint32_t stbir_uint32;
    194 #endif
    195 
    196 #ifdef STB_IMAGE_RESIZE_STATIC
    197 #define STBIRDEF static
    198 #else
    199 #ifdef __cplusplus
    200 #define STBIRDEF extern "C"
    201 #else
    202 #define STBIRDEF extern
    203 #endif
    204 #endif
    205 
    206 
    207 //////////////////////////////////////////////////////////////////////////////
    208 //
    209 // Easy-to-use API:
    210 //
    211 //     * "input pixels" points to an array of image data with 'num_channels' channels (e.g. RGB=3, RGBA=4)
    212 //     * input_w is input image width (x-axis), input_h is input image height (y-axis)
    213 //     * stride is the offset between successive rows of image data in memory, in bytes. you can
    214 //       specify 0 to mean packed continuously in memory
    215 //     * alpha channel is treated identically to other channels.
    216 //     * colorspace is linear or sRGB as specified by function name
    217 //     * returned result is 1 for success or 0 in case of an error.
    218 //       #define STBIR_ASSERT() to trigger an assert on parameter validation errors.
    219 //     * Memory required grows approximately linearly with input and output size, but with
    220 //       discontinuities at input_w == output_w and input_h == output_h.
    221 //     * These functions use a "default" resampling filter defined at compile time. To change the filter,
    222 //       you can change the compile-time defaults by #defining STBIR_DEFAULT_FILTER_UPSAMPLE
    223 //       and STBIR_DEFAULT_FILTER_DOWNSAMPLE, or you can use the medium-complexity API.
    224 
    225 STBIRDEF int stbir_resize_uint8(     const unsigned char *input_pixels , int input_w , int input_h , int input_stride_in_bytes,
    226                                            unsigned char *output_pixels, int output_w, int output_h, int output_stride_in_bytes,
    227                                      int num_channels);
    228 
    229 STBIRDEF int stbir_resize_float(     const float *input_pixels , int input_w , int input_h , int input_stride_in_bytes,
    230                                            float *output_pixels, int output_w, int output_h, int output_stride_in_bytes,
    231                                      int num_channels);
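The stride convention described above (a byte offset between rows, with 0 meaning tightly packed) can be sketched as follows. The helper names `effective_stride_bytes` and `pixel_offset_bytes` are hypothetical and exist only for illustration; they are not part of this header.

```c
#include <stddef.h>

/* Row stride actually used: a stride of 0 means rows are tightly packed. */
static size_t effective_stride_bytes(int width, int num_channels,
                                     size_t bytes_per_channel, int stride_in_bytes)
{
    if (stride_in_bytes != 0)
        return (size_t)stride_in_bytes;
    return (size_t)width * (size_t)num_channels * bytes_per_channel;
}

/* Byte offset of channel 0 of pixel (x, y) from the start of the image. */
static size_t pixel_offset_bytes(int x, int y, int width, int num_channels,
                                 size_t bytes_per_channel, int stride_in_bytes)
{
    size_t stride = effective_stride_bytes(width, num_channels,
                                           bytes_per_channel, stride_in_bytes);
    return (size_t)y * stride + (size_t)x * (size_t)num_channels * bytes_per_channel;
}
```

For a 10-pixel-wide RGB uint8 image, a stride of 0 resolves to 30 bytes per row; passing 32 instead would describe rows padded to a 32-byte pitch.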
    232 
    233 
    234 // The following functions interpret image data as gamma-corrected sRGB. 
    235 // Specify STBIR_ALPHA_CHANNEL_NONE if you have no alpha channel,
    236 // or otherwise provide the index of the alpha channel. Flags value
    237 // of 0 will probably do the right thing if you're not sure what
    238 // the flags mean.
    239 
    240 #define STBIR_ALPHA_CHANNEL_NONE       -1
    241 
    242 // Set this flag if your texture has premultiplied alpha. Otherwise, stbir will
    243 // use alpha-weighted resampling (effectively premultiplying, resampling,
    244 // then unpremultiplying).
    245 #define STBIR_FLAG_ALPHA_PREMULTIPLIED    (1 << 0)
    246 // The specified alpha channel should be handled as gamma-corrected value even
    247 // when doing sRGB operations.
    248 #define STBIR_FLAG_ALPHA_USES_COLORSPACE  (1 << 1)
    249 
    250 STBIRDEF int stbir_resize_uint8_srgb(const unsigned char *input_pixels , int input_w , int input_h , int input_stride_in_bytes,
    251                                            unsigned char *output_pixels, int output_w, int output_h, int output_stride_in_bytes,
    252                                      int num_channels, int alpha_channel, int flags);
    253 
    254 
    255 typedef enum
    256 {
    257     STBIR_EDGE_CLAMP   = 1,
    258     STBIR_EDGE_REFLECT = 2,
    259     STBIR_EDGE_WRAP    = 3,
    260     STBIR_EDGE_ZERO    = 4,
    261 } stbir_edge;
    262 
    263 // This function adds the ability to specify how requests to sample off the edge of the image are handled.
    264 STBIRDEF int stbir_resize_uint8_srgb_edgemode(const unsigned char *input_pixels , int input_w , int input_h , int input_stride_in_bytes,
    265                                                     unsigned char *output_pixels, int output_w, int output_h, int output_stride_in_bytes,
    266                                               int num_channels, int alpha_channel, int flags,
    267                                               stbir_edge edge_wrap_mode);
    268 
    269 //////////////////////////////////////////////////////////////////////////////
    270 //
    271 // Medium-complexity API
    272 //
    273 // This extends the easy-to-use API as follows:
    274 //
    275 //     * Alpha-channel can be processed separately
    276 //       * If alpha_channel is not STBIR_ALPHA_CHANNEL_NONE
    277 //         * Alpha channel will not be gamma corrected (unless flags&STBIR_FLAG_ALPHA_USES_COLORSPACE)
    278 //         * Filters will be weighted by alpha channel (unless flags&STBIR_FLAG_ALPHA_PREMULTIPLIED)
    279 //     * Filter can be selected explicitly
    280 //     * uint16 image type
    281 //     * sRGB colorspace available for all types
    282 //     * context parameter for passing to STBIR_MALLOC
    283 
    284 typedef enum
    285 {
    286     STBIR_FILTER_DEFAULT      = 0,  // use same filter type that easy-to-use API chooses
    287     STBIR_FILTER_BOX          = 1,  // A trapezoid w/1-pixel wide ramps, same result as box for integer scale ratios
    288     STBIR_FILTER_TRIANGLE     = 2,  // On upsampling, produces same results as bilinear texture filtering
    289     STBIR_FILTER_CUBICBSPLINE = 3,  // The cubic b-spline (aka Mitchell-Netravali with B=1,C=0), gaussian-esque
    290     STBIR_FILTER_CATMULLROM   = 4,  // An interpolating cubic spline
    291     STBIR_FILTER_MITCHELL     = 5,  // Mitchell-Netravali filter with B=1/3, C=1/3
    292 } stbir_filter;
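The three cubic filters in the enum above are members of the Mitchell-Netravali family, distinguished by their B and C parameters. The sketch below evaluates the general family formula; `mitchell_netravali` is a hypothetical name, and this is not a copy of the library's own kernel functions (which have a different signature).

```c
/* Mitchell-Netravali cubic with parameters B and C, evaluated at distance x
   (in pixels) from the kernel center. Nonzero only for |x| < 2. */
static float mitchell_netravali(float x, float B, float C)
{
    if (x < 0.0f)
        x = -x;

    if (x < 1.0f)
        return ((12.0f - 9.0f * B - 6.0f * C) * x * x * x +
                (-18.0f + 12.0f * B + 6.0f * C) * x * x +
                (6.0f - 2.0f * B)) / 6.0f;

    if (x < 2.0f)
        return ((-B - 6.0f * C) * x * x * x +
                (6.0f * B + 30.0f * C) * x * x +
                (-12.0f * B - 48.0f * C) * x +
                (8.0f * B + 24.0f * C)) / 6.0f;

    return 0.0f;
}
```

With B=0, C=0.5 (Catmull-Rom in the usual parameterization) the kernel is 1 at x=0 and 0 at x=1, which is why that filter interpolates. With B=1, C=0 the value at x=0 is 2/3, so the cubic b-spline smooths rather than interpolates.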
    293 
    294 typedef enum
    295 {
    296     STBIR_COLORSPACE_LINEAR,
    297     STBIR_COLORSPACE_SRGB,
    298 
    299     STBIR_MAX_COLORSPACES,
    300 } stbir_colorspace;
    301 
    302 // The following functions are all identical except for the type of the image data
    303 
    304 STBIRDEF int stbir_resize_uint8_generic( const unsigned char *input_pixels , int input_w , int input_h , int input_stride_in_bytes,
    305                                                unsigned char *output_pixels, int output_w, int output_h, int output_stride_in_bytes,
    306                                          int num_channels, int alpha_channel, int flags,
    307                                          stbir_edge edge_wrap_mode, stbir_filter filter, stbir_colorspace space, 
    308                                          void *alloc_context);
    309 
    310 STBIRDEF int stbir_resize_uint16_generic(const stbir_uint16 *input_pixels  , int input_w , int input_h , int input_stride_in_bytes,
    311                                                stbir_uint16 *output_pixels , int output_w, int output_h, int output_stride_in_bytes,
    312                                          int num_channels, int alpha_channel, int flags,
    313                                          stbir_edge edge_wrap_mode, stbir_filter filter, stbir_colorspace space, 
    314                                          void *alloc_context);
    315 
    316 STBIRDEF int stbir_resize_float_generic( const float *input_pixels         , int input_w , int input_h , int input_stride_in_bytes,
    317                                                float *output_pixels        , int output_w, int output_h, int output_stride_in_bytes,
    318                                          int num_channels, int alpha_channel, int flags,
    319                                          stbir_edge edge_wrap_mode, stbir_filter filter, stbir_colorspace space, 
    320                                          void *alloc_context);
    321 
    322 
    323 
    324 //////////////////////////////////////////////////////////////////////////////
    325 //
    326 // Full-complexity API
    327 //
    328 // This extends the medium API as follows:
    329 //
    330 //     * uint32 image type
    331 //     * not typesafe
    332 //     * separate filter types for each axis
    333 //     * separate edge modes for each axis
    334 //     * can specify scale explicitly for subpixel correctness
    335 //     * can specify image source tile using texture coordinates
    336 
    337 typedef enum
    338 {
    339     STBIR_TYPE_UINT8 ,
    340     STBIR_TYPE_UINT16,
    341     STBIR_TYPE_UINT32,
    342     STBIR_TYPE_FLOAT ,
    343 
    344     STBIR_MAX_TYPES
    345 } stbir_datatype;
    346 
    347 STBIRDEF int stbir_resize(         const void *input_pixels , int input_w , int input_h , int input_stride_in_bytes,
    348                                          void *output_pixels, int output_w, int output_h, int output_stride_in_bytes,
    349                                    stbir_datatype datatype,
    350                                    int num_channels, int alpha_channel, int flags,
    351                                    stbir_edge edge_mode_horizontal, stbir_edge edge_mode_vertical, 
    352                                    stbir_filter filter_horizontal,  stbir_filter filter_vertical,
    353                                    stbir_colorspace space, void *alloc_context);
    354 
    355 STBIRDEF int stbir_resize_subpixel(const void *input_pixels , int input_w , int input_h , int input_stride_in_bytes,
    356                                          void *output_pixels, int output_w, int output_h, int output_stride_in_bytes,
    357                                    stbir_datatype datatype,
    358                                    int num_channels, int alpha_channel, int flags,
    359                                    stbir_edge edge_mode_horizontal, stbir_edge edge_mode_vertical, 
    360                                    stbir_filter filter_horizontal,  stbir_filter filter_vertical,
    361                                    stbir_colorspace space, void *alloc_context,
    362                                    float x_scale, float y_scale,
    363                                    float x_offset, float y_offset);
    364 
    365 STBIRDEF int stbir_resize_region(  const void *input_pixels , int input_w , int input_h , int input_stride_in_bytes,
    366                                          void *output_pixels, int output_w, int output_h, int output_stride_in_bytes,
    367                                    stbir_datatype datatype,
    368                                    int num_channels, int alpha_channel, int flags,
    369                                    stbir_edge edge_mode_horizontal, stbir_edge edge_mode_vertical, 
    370                                    stbir_filter filter_horizontal,  stbir_filter filter_vertical,
    371                                    stbir_colorspace space, void *alloc_context,
    372                                    float s0, float t0, float s1, float t1);
    373 // (s0, t0) & (s1, t1) are the top-left and bottom-right corners (uv addressing style: [0, 1]x[0, 1]) of the region of the input image to use.
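To make the uv-style region coordinates concrete, the sketch below converts a region to input-pixel coordinates and derives the resulting magnification along x. `region_px`, `region_to_pixels`, and `region_scale_x` are hypothetical helpers; this is one way to reason about the parameters, not a description of the library's internal transform code.

```c
/* Hypothetical helper: the input-pixel rectangle selected by a region. */
typedef struct { float x0, y0, x1, y1; } region_px;

/* Map (s0, t0)-(s1, t1) in [0, 1]x[0, 1] onto input pixel coordinates. */
static region_px region_to_pixels(int input_w, int input_h,
                                  float s0, float t0, float s1, float t1)
{
    region_px r;
    r.x0 = s0 * (float)input_w;
    r.y0 = t0 * (float)input_h;
    r.x1 = s1 * (float)input_w;
    r.y1 = t1 * (float)input_h;
    return r;
}

/* Magnification along x when that rectangle is resized to output_w pixels. */
static float region_scale_x(int input_w, int output_w, float s0, float s1)
{
    return (float)output_w / ((s1 - s0) * (float)input_w);
}
```

For a 100-pixel-wide input, s0=0.25 and s1=0.75 select columns 25 through 75; resizing that to 200 output pixels is a 4x magnification.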
    374 
    375 //
    376 //
    377 ////   end header file   /////////////////////////////////////////////////////
    378 #endif // STBIR_INCLUDE_STB_IMAGE_RESIZE_H
    379 
    380 
    381 
    382 
    383 
    384 #ifdef STB_IMAGE_RESIZE_IMPLEMENTATION
    385 
    386 #ifndef STBIR_ASSERT
    387 #include <assert.h>
    388 #define STBIR_ASSERT(x) assert(x)
    389 #endif
    390 
    391 // For memset
    392 #include <string.h>
    393 
    394 #include <math.h>
    395 
    396 #ifndef STBIR_MALLOC
    397 #include <stdlib.h>
    398 // use comma operator to evaluate c, to avoid "unused parameter" warnings
    399 #define STBIR_MALLOC(size,c) ((void)(c), malloc(size))
    400 #define STBIR_FREE(ptr,c)    ((void)(c), free(ptr))
    401 #endif
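Because the documentation above states that each resize makes a single allocation and a single free, the context pointer can carry a preallocated scratch buffer. The following is a minimal sketch of that idea. `temp_arena`, `temp_arena_alloc`, and `temp_arena_free` are hypothetical names, and the caller is assumed to supply a buffer that is large enough and suitably aligned for float data.

```c
#include <stddef.h>

/* Hypothetical scratch buffer handed to the resizer through the context. */
typedef struct {
    unsigned char *base;     /* caller-owned storage, assumed suitably aligned */
    size_t         capacity; /* total bytes available */
    size_t         used;     /* bytes handed out so far */
} temp_arena;

/* Hand out the next free bytes of the arena, or NULL if they do not fit. */
static void *temp_arena_alloc(size_t size, void *context)
{
    temp_arena *arena = (temp_arena *)context;
    void *result;

    if (arena == NULL || size > arena->capacity - arena->used)
        return NULL;

    result = arena->base + arena->used;
    arena->used += size;
    return result;
}

/* One allocation per resize, so freeing simply resets the arena. */
static void temp_arena_free(void *ptr, void *context)
{
    temp_arena *arena = (temp_arena *)context;
    (void)ptr;
    if (arena != NULL)
        arena->used = 0;
}

/* These would be defined before the #include that triggers the implementation. */
#define STBIR_MALLOC(size,context) temp_arena_alloc((size), (context))
#define STBIR_FREE(ptr,context)    temp_arena_free((ptr), (context))
```

The arena pointer would then be passed as the `alloc_context` argument of the medium- or full-complexity functions.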
    402 
    403 #ifndef _MSC_VER
    404 #ifdef __cplusplus
    405 #define stbir__inline inline
    406 #else
    407 #define stbir__inline
    408 #endif
    409 #else
    410 #define stbir__inline __forceinline
    411 #endif
    412 
    413 
    414 // should produce compiler error if size is wrong
    415 typedef unsigned char stbir__validate_uint32[sizeof(stbir_uint32) == 4 ? 1 : -1];
    416 
    417 #ifdef _MSC_VER
    418 #define STBIR__NOTUSED(v)  (void)(v)
    419 #else
    420 #define STBIR__NOTUSED(v)  (void)sizeof(v)
    421 #endif
    422 
    423 #define STBIR__ARRAY_SIZE(a) (sizeof((a))/sizeof((a)[0]))
    424 
    425 #ifndef STBIR_DEFAULT_FILTER_UPSAMPLE
    426 #define STBIR_DEFAULT_FILTER_UPSAMPLE    STBIR_FILTER_CATMULLROM
    427 #endif
    428 
    429 #ifndef STBIR_DEFAULT_FILTER_DOWNSAMPLE
    430 #define STBIR_DEFAULT_FILTER_DOWNSAMPLE  STBIR_FILTER_MITCHELL
    431 #endif
    432 
    433 #ifndef STBIR_PROGRESS_REPORT
    434 #define STBIR_PROGRESS_REPORT(float_0_to_1)
    435 #endif
    436 
    437 #ifndef STBIR_MAX_CHANNELS
    438 #define STBIR_MAX_CHANNELS 64
    439 #endif
    440 
    441 #if STBIR_MAX_CHANNELS > 65536
    442 #error "Too many channels; STBIR_MAX_CHANNELS must be no more than 65536."
    443 // because we store the indices in 16-bit variables
    444 #endif
    445 
    446 // This value is added to alpha just before premultiplication to avoid
    447 // zeroing out color values. It is equivalent to 2^-80. If you don't want
    448 // that behavior (it may interfere if you have floating point images with
    449 // very small alpha values) then you can define STBIR_NO_ALPHA_EPSILON to
    450 // disable it.
    451 #ifndef STBIR_ALPHA_EPSILON
    452 #define STBIR_ALPHA_EPSILON ((float)1 / (1 << 20) / (1 << 20) / (1 << 20) / (1 << 20))
    453 #endif
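The effect of the epsilon described above can be illustrated in isolation. This sketch reuses the same constant expression under a different, hypothetical macro name (`ALPHA_EPSILON_SKETCH`) and shows a premultiply/unpremultiply round trip for a single channel; `pow2_neg` and `premultiply_roundtrip` are illustrative helpers, not library functions.

```c
/* Same expression as the default STBIR_ALPHA_EPSILON: 1 / (2^20)^4 = 2^-80. */
#define ALPHA_EPSILON_SKETCH \
    ((float)1 / (1 << 20) / (1 << 20) / (1 << 20) / (1 << 20))

/* 2^-n computed by repeated halving, used only to check the value above. */
static float pow2_neg(int n)
{
    float v = 1.0f;
    while (n-- > 0)
        v *= 0.5f;
    return v;
}

/* Premultiply a color by (alpha + epsilon), then divide it back out.
   With epsilon == 0 and alpha == 0 the color is lost and 0 is returned. */
static float premultiply_roundtrip(float color, float alpha, float epsilon)
{
    float a = alpha + epsilon;
    float premultiplied = color * a;
    return (a != 0.0f) ? premultiplied / a : 0.0f;
}
```

Because the epsilon is a power of two, multiplying and dividing by it is exact, so a color attached to an alpha=0 pixel survives the round trip unchanged, whereas without the epsilon it collapses to 0.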
    454 
    455 
    456 
    457 #ifdef _MSC_VER
    458 #define STBIR__UNUSED_PARAM(v)  (void)(v)
    459 #else
    460 #define STBIR__UNUSED_PARAM(v)  (void)sizeof(v)
    461 #endif
    462 
    463 // must match stbir_datatype
    464 static unsigned char stbir__type_size[] = {
    465     1, // STBIR_TYPE_UINT8
    466     2, // STBIR_TYPE_UINT16
    467     4, // STBIR_TYPE_UINT32
    468     4, // STBIR_TYPE_FLOAT
    469 };
    470 
    471 // Kernel function centered at 0
    472 typedef float (stbir__kernel_fn)(float x, float scale);
    473 typedef float (stbir__support_fn)(float scale);
    474 
    475 typedef struct
    476 {
    477     stbir__kernel_fn* kernel;
    478     stbir__support_fn* support;
    479 } stbir__filter_info;
    480 
    481 // When upsampling, the contributors are the source pixels that contribute to a destination pixel.
    482 // When downsampling, the contributors are the destination pixels that a source pixel contributes to.
    483 typedef struct
    484 {
    485     int n0; // First contributing pixel
    486     int n1; // Last contributing pixel
    487 } stbir__contributors;
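For the upsampling case, a contributor range can be pictured as the span of input pixels whose centers fall inside the filter footprint around an output pixel. The sketch below is a simplified illustration under stated assumptions: no subpixel shift, a pixel center at index + 0.5, and a filter radius given in input pixels. `contributor_range`, `floor_to_int`, and `upsample_contributors` are hypothetical names, not the library's routines.

```c
/* Hypothetical stand-in for the n0/n1 pair above. */
typedef struct { int n0, n1; } contributor_range;

/* floor() for small values without pulling in <math.h>. */
static int floor_to_int(float x)
{
    int i = (int)x;                     /* truncates toward zero */
    return ((float)i > x) ? i - 1 : i;  /* step down for negative fractions */
}

/* For output pixel out_x, find the first and last input pixels whose centers
   fall inside the filter footprint. scale is output_w / input_w and is
   assumed to be greater than 1; support_in_pixels is the filter radius
   measured in input pixels. */
static contributor_range upsample_contributors(int out_x, float scale,
                                               float support_in_pixels)
{
    contributor_range r;
    float center_in = ((float)out_x + 0.5f) / scale; /* output center in input space */
    r.n0 = floor_to_int(center_in - support_in_pixels + 0.5f);
    r.n1 = floor_to_int(center_in + support_in_pixels - 0.5f);
    return r;
}
```

At 2x magnification with a radius of 1, output pixel 5 draws on input pixels 2 and 3. Output pixel 0 yields the range -1 to 0; the out-of-bounds index -1 is the kind of request the edge mode exists to resolve.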
    488 
    489 typedef struct
    490 {
    491     const void* input_data;
    492     int input_w;
    493     int input_h;
    494     int input_stride_bytes;
    495 
    496     void* output_data;
    497     int output_w;
    498     int output_h;
    499     int output_stride_bytes;
    500 
    501     float s0, t0, s1, t1;
    502 
    503     float horizontal_shift; // Units: output pixels
    504     float vertical_shift;   // Units: output pixels
    505     float horizontal_scale;
    506     float vertical_scale;
    507 
    508     int channels;
    509     int alpha_channel;
    510     stbir_uint32 flags;
    511     stbir_datatype type;
    512     stbir_filter horizontal_filter;
    513     stbir_filter vertical_filter;
    514     stbir_edge edge_horizontal;
    515     stbir_edge edge_vertical;
    516     stbir_colorspace colorspace;
    517 
    518     stbir__contributors* horizontal_contributors;
    519     float* horizontal_coefficients;
    520 
    521     stbir__contributors* vertical_contributors;
    522     float* vertical_coefficients;
    523 
    524     int decode_buffer_pixels;
    525     float* decode_buffer;
    526 
    527     float* horizontal_buffer;
    528 
    529     // cache these because ceil/floor are inexplicably showing up in profile
    530     int horizontal_coefficient_width;
    531     int vertical_coefficient_width;
    532     int horizontal_filter_pixel_width;
    533     int vertical_filter_pixel_width;
    534     int horizontal_filter_pixel_margin;
    535     int vertical_filter_pixel_margin;
    536     int horizontal_num_contributors;
    537     int vertical_num_contributors;
    538 
    539     int ring_buffer_length_bytes;   // The length of an individual entry in the ring buffer. The total number of ring buffers is stbir__get_filter_pixel_width(filter)
    540     int ring_buffer_num_entries;    // Total number of entries in the ring buffer.
    541     int ring_buffer_first_scanline;
    542     int ring_buffer_last_scanline;
    543     int ring_buffer_begin_index;    // first_scanline is at this index in the ring buffer
    544     float* ring_buffer;
    545 
    546     float* encode_buffer; // A temporary buffer to store floats so we don't lose precision while we do multiply-adds.
    547 
    548     int horizontal_contributors_size;
    549     int horizontal_coefficients_size;
    550     int vertical_contributors_size;
    551     int vertical_coefficients_size;
    552     int decode_buffer_size;
    553     int horizontal_buffer_size;
    554     int ring_buffer_size;
    555     int encode_buffer_size;
    556 } stbir__info;
    557 
    558 
    559 static const float stbir__max_uint8_as_float  = 255.0f;
    560 static const float stbir__max_uint16_as_float = 65535.0f;
    561 static const double stbir__max_uint32_as_float = 4294967295.0;
    562 
    563 
    564 static stbir__inline int stbir__min(int a, int b)
    565 {
    566     return a < b ? a : b;
    567 }
    568 
    569 static stbir__inline float stbir__saturate(float x)
    570 {
    571     if (x < 0)
    572         return 0;
    573 
    574     if (x > 1)
    575         return 1;
    576 
    577     return x;
    578 }
    579 
    580 #ifdef STBIR_SATURATE_INT
    581 static stbir__inline stbir_uint8 stbir__saturate8(int x)
    582 {
    583     if ((unsigned int) x <= 255)
    584         return x;
    585 
    586     if (x < 0)
    587         return 0;
    588 
    589     return 255;
    590 }
    591 
    592 static stbir__inline stbir_uint16 stbir__saturate16(int x)
    593 {
    594     if ((unsigned int) x <= 65535)
    595         return x;
    596 
    597     if (x < 0)
    598         return 0;
    599 
    600     return 65535;
    601 }
    602 #endif
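The integer saturation helpers above rely on a single unsigned comparison to accept in-range values. A standalone sketch of the 8-bit case, using the hypothetical name `saturate8_sketch`, shows why that works:

```c
/* Clamp an int to [0, 255]. Converting to unsigned turns every negative value
   into a very large one, so a single comparison accepts exactly 0..255. */
static unsigned char saturate8_sketch(int x)
{
    if ((unsigned int)x <= 255u)
        return (unsigned char)x;

    /* Out of range: negative values clamp to 0, large values to 255. */
    return (x < 0) ? 0 : 255;
}
```

The common in-range case costs one comparison; the sign test is only reached for values that actually need clamping.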
    603 
    604 static float stbir__srgb_uchar_to_linear_float[256] = {
    605     0.000000f, 0.000304f, 0.000607f, 0.000911f, 0.001214f, 0.001518f, 0.001821f, 0.002125f, 0.002428f, 0.002732f, 0.003035f,
    606     0.003347f, 0.003677f, 0.004025f, 0.004391f, 0.004777f, 0.005182f, 0.005605f, 0.006049f, 0.006512f, 0.006995f, 0.007499f,
    607     0.008023f, 0.008568f, 0.009134f, 0.009721f, 0.010330f, 0.010960f, 0.011612f, 0.012286f, 0.012983f, 0.013702f, 0.014444f,
    608     0.015209f, 0.015996f, 0.016807f, 0.017642f, 0.018500f, 0.019382f, 0.020289f, 0.021219f, 0.022174f, 0.023153f, 0.024158f,
    609     0.025187f, 0.026241f, 0.027321f, 0.028426f, 0.029557f, 0.030713f, 0.031896f, 0.033105f, 0.034340f, 0.035601f, 0.036889f,
    610     0.038204f, 0.039546f, 0.040915f, 0.042311f, 0.043735f, 0.045186f, 0.046665f, 0.048172f, 0.049707f, 0.051269f, 0.052861f,
    611     0.054480f, 0.056128f, 0.057805f, 0.059511f, 0.061246f, 0.063010f, 0.064803f, 0.066626f, 0.068478f, 0.070360f, 0.072272f,
    612     0.074214f, 0.076185f, 0.078187f, 0.080220f, 0.082283f, 0.084376f, 0.086500f, 0.088656f, 0.090842f, 0.093059f, 0.095307f,
    613     0.097587f, 0.099899f, 0.102242f, 0.104616f, 0.107023f, 0.109462f, 0.111932f, 0.114435f, 0.116971f, 0.119538f, 0.122139f,
    614     0.124772f, 0.127438f, 0.130136f, 0.132868f, 0.135633f, 0.138432f, 0.141263f, 0.144128f, 0.147027f, 0.149960f, 0.152926f,
    615     0.155926f, 0.158961f, 0.162029f, 0.165132f, 0.168269f, 0.171441f, 0.174647f, 0.177888f, 0.181164f, 0.184475f, 0.187821f,
    616     0.191202f, 0.194618f, 0.198069f, 0.201556f, 0.205079f, 0.208637f, 0.212231f, 0.215861f, 0.219526f, 0.223228f, 0.226966f,
    617     0.230740f, 0.234551f, 0.238398f, 0.242281f, 0.246201f, 0.250158f, 0.254152f, 0.258183f, 0.262251f, 0.266356f, 0.270498f,
    618     0.274677f, 0.278894f, 0.283149f, 0.287441f, 0.291771f, 0.296138f, 0.300544f, 0.304987f, 0.309469f, 0.313989f, 0.318547f,
    619     0.323143f, 0.327778f, 0.332452f, 0.337164f, 0.341914f, 0.346704f, 0.351533f, 0.356400f, 0.361307f, 0.366253f, 0.371238f,
    620     0.376262f, 0.381326f, 0.386430f, 0.391573f, 0.396755f, 0.401978f, 0.407240f, 0.412543f, 0.417885f, 0.423268f, 0.428691f,
    621     0.434154f, 0.439657f, 0.445201f, 0.450786f, 0.456411f, 0.462077f, 0.467784f, 0.473532f, 0.479320f, 0.485150f, 0.491021f,
    622     0.496933f, 0.502887f, 0.508881f, 0.514918f, 0.520996f, 0.527115f, 0.533276f, 0.539480f, 0.545725f, 0.552011f, 0.558340f,
    623     0.564712f, 0.571125f, 0.577581f, 0.584078f, 0.590619f, 0.597202f, 0.603827f, 0.610496f, 0.617207f, 0.623960f, 0.630757f,
    624     0.637597f, 0.644480f, 0.651406f, 0.658375f, 0.665387f, 0.672443f, 0.679543f, 0.686685f, 0.693872f, 0.701102f, 0.708376f,
    625     0.715694f, 0.723055f, 0.730461f, 0.737911f, 0.745404f, 0.752942f, 0.760525f, 0.768151f, 0.775822f, 0.783538f, 0.791298f,
    626     0.799103f, 0.806952f, 0.814847f, 0.822786f, 0.830770f, 0.838799f, 0.846873f, 0.854993f, 0.863157f, 0.871367f, 0.879622f,
    627     0.887923f, 0.896269f, 0.904661f, 0.913099f, 0.921582f, 0.930111f, 0.938686f, 0.947307f, 0.955974f, 0.964686f, 0.973445f,
    628     0.982251f, 0.991102f, 1.0f
    629 };
    630 
    631 static float stbir__srgb_to_linear(float f)
    632 {
    633     if (f <= 0.04045f)
    634         return f / 12.92f;
    635     else
    636         return (float)pow((f + 0.055f) / 1.055f, 2.4f);
    637 }
    638 
    639 static float stbir__linear_to_srgb(float f)
    640 {
    641     if (f <= 0.0031308f)
    642         return f * 12.92f;
    643     else
    644         return 1.055f * (float)pow(f, 1 / 2.4f) - 0.055f;
    645 }
    646 
    647 #ifndef STBIR_NON_IEEE_FLOAT
    648 // From https://gist.github.com/rygorous/2203834
    649 
    650 typedef union
    651 {
    652     stbir_uint32 u;
    653     float f;
    654 } stbir__FP32;
    655 
    656 static const stbir_uint32 fp32_to_srgb8_tab4[104] = {
    657     0x0073000d, 0x007a000d, 0x0080000d, 0x0087000d, 0x008d000d, 0x0094000d, 0x009a000d, 0x00a1000d,
    658     0x00a7001a, 0x00b4001a, 0x00c1001a, 0x00ce001a, 0x00da001a, 0x00e7001a, 0x00f4001a, 0x0101001a,
    659     0x010e0033, 0x01280033, 0x01410033, 0x015b0033, 0x01750033, 0x018f0033, 0x01a80033, 0x01c20033,
    660     0x01dc0067, 0x020f0067, 0x02430067, 0x02760067, 0x02aa0067, 0x02dd0067, 0x03110067, 0x03440067,
    661     0x037800ce, 0x03df00ce, 0x044600ce, 0x04ad00ce, 0x051400ce, 0x057b00c5, 0x05dd00bc, 0x063b00b5,
    662     0x06970158, 0x07420142, 0x07e30130, 0x087b0120, 0x090b0112, 0x09940106, 0x0a1700fc, 0x0a9500f2,
    663     0x0b0f01cb, 0x0bf401ae, 0x0ccb0195, 0x0d950180, 0x0e56016e, 0x0f0d015e, 0x0fbc0150, 0x10630143,
    664     0x11070264, 0x1238023e, 0x1357021d, 0x14660201, 0x156601e9, 0x165a01d3, 0x174401c0, 0x182401af,
    665     0x18fe0331, 0x1a9602fe, 0x1c1502d2, 0x1d7e02ad, 0x1ed4028d, 0x201a0270, 0x21520256, 0x227d0240,
    666     0x239f0443, 0x25c003fe, 0x27bf03c4, 0x29a10392, 0x2b6a0367, 0x2d1d0341, 0x2ebe031f, 0x304d0300,
    667     0x31d105b0, 0x34a80555, 0x37520507, 0x39d504c5, 0x3c37048b, 0x3e7c0458, 0x40a8042a, 0x42bd0401,
    668     0x44c20798, 0x488e071e, 0x4c1c06b6, 0x4f76065d, 0x52a50610, 0x55ac05cc, 0x5892058f, 0x5b590559,
    669     0x5e0c0a23, 0x631c0980, 0x67db08f6, 0x6c55087f, 0x70940818, 0x74a007bd, 0x787d076c, 0x7c330723,
    670 };
    671  
    672 static stbir_uint8 stbir__linear_to_srgb_uchar(float in)
    673 {
    674     static const stbir__FP32 almostone = { 0x3f7fffff }; // 1-eps
    675     static const stbir__FP32 minval = { (127-13) << 23 };
    676     stbir_uint32 tab,bias,scale,t;
    677     stbir__FP32 f;
    678  
    679     // Clamp to [2^(-13), 1-eps]; these two values map to 0 and 1, respectively.
    680     // The tests are carefully written so that NaNs map to 0, same as in the reference
    681     // implementation.
    682     if (!(in > minval.f)) // written this way to catch NaNs
    683         in = minval.f;
    684     if (in > almostone.f)
    685         in = almostone.f;
    686  
    687     // Do the table lookup and unpack bias, scale
    688     f.f = in;
    689     tab = fp32_to_srgb8_tab4[(f.u - minval.u) >> 20];
    690     bias = (tab >> 16) << 9;
    691     scale = tab & 0xffff;
    692  
    693     // Grab next-highest mantissa bits and perform linear interpolation
    694     t = (f.u >> 12) & 0xff;
    695     return (unsigned char) ((bias + scale*t) >> 16);
    696 }
    697 
    698 #else
    699 // sRGB transition values, scaled by 1<<28
    700 static int stbir__srgb_offset_to_linear_scaled[256] =
    701 {
    702             0,     40738,    122216,    203693,    285170,    366648,    448125,    529603,
    703        611080,    692557,    774035,    855852,    942009,   1033024,   1128971,   1229926,
    704       1335959,   1447142,   1563542,   1685229,   1812268,   1944725,   2082664,   2226148,
    705       2375238,   2529996,   2690481,   2856753,   3028870,   3206888,   3390865,   3580856,
    706       3776916,   3979100,   4187460,   4402049,   4622919,   4850123,   5083710,   5323731,
    707       5570236,   5823273,   6082892,   6349140,   6622065,   6901714,   7188133,   7481369,
    708       7781466,   8088471,   8402427,   8723380,   9051372,   9386448,   9728650,  10078021,
    709      10434603,  10798439,  11169569,  11548036,  11933879,  12327139,  12727857,  13136073,
    710      13551826,  13975156,  14406100,  14844697,  15290987,  15745007,  16206795,  16676389,
    711      17153826,  17639142,  18132374,  18633560,  19142734,  19659934,  20185196,  20718552,
    712      21260042,  21809696,  22367554,  22933648,  23508010,  24090680,  24681686,  25281066,
    713      25888850,  26505076,  27129772,  27762974,  28404716,  29055026,  29713942,  30381490,
    714      31057708,  31742624,  32436272,  33138682,  33849884,  34569912,  35298800,  36036568,
    715      36783260,  37538896,  38303512,  39077136,  39859796,  40651528,  41452360,  42262316,
    716      43081432,  43909732,  44747252,  45594016,  46450052,  47315392,  48190064,  49074096,
    717      49967516,  50870356,  51782636,  52704392,  53635648,  54576432,  55526772,  56486700,
    718      57456236,  58435408,  59424248,  60422780,  61431036,  62449032,  63476804,  64514376,
    719      65561776,  66619028,  67686160,  68763192,  69850160,  70947088,  72053992,  73170912,
    720      74297864,  75434880,  76581976,  77739184,  78906536,  80084040,  81271736,  82469648,
    721      83677792,  84896192,  86124888,  87363888,  88613232,  89872928,  91143016,  92423512,
    722      93714432,  95015816,  96327688,  97650056,  98982952, 100326408, 101680440, 103045072,
    723     104420320, 105806224, 107202800, 108610064, 110028048, 111456776, 112896264, 114346544,
    724     115807632, 117279552, 118762328, 120255976, 121760536, 123276016, 124802440, 126339832,
    725     127888216, 129447616, 131018048, 132599544, 134192112, 135795792, 137410592, 139036528,
    726     140673648, 142321952, 143981456, 145652208, 147334208, 149027488, 150732064, 152447968,
    727     154175200, 155913792, 157663776, 159425168, 161197984, 162982240, 164777968, 166585184,
    728     168403904, 170234160, 172075968, 173929344, 175794320, 177670896, 179559120, 181458992,
    729     183370528, 185293776, 187228736, 189175424, 191133888, 193104112, 195086128, 197079968,
    730     199085648, 201103184, 203132592, 205173888, 207227120, 209292272, 211369392, 213458480,
    731     215559568, 217672656, 219797792, 221934976, 224084240, 226245600, 228419056, 230604656,
    732     232802400, 235012320, 237234432, 239468736, 241715280, 243974080, 246245120, 248528464,
    733     250824112, 253132064, 255452368, 257785040, 260130080, 262487520, 264857376, 267239664,
    734 };
    735 
    736 static stbir_uint8 stbir__linear_to_srgb_uchar(float f)
    737 {
    738     int x = (int) (f * (1 << 28)); // has headroom so you don't need to clamp
    739     int v = 0;
    740     int i;
    741 
    742     // Refine the guess with a short binary search.
    743     i = v + 128; if (x >= stbir__srgb_offset_to_linear_scaled[i]) v = i;
    744     i = v +  64; if (x >= stbir__srgb_offset_to_linear_scaled[i]) v = i;
    745     i = v +  32; if (x >= stbir__srgb_offset_to_linear_scaled[i]) v = i;
    746     i = v +  16; if (x >= stbir__srgb_offset_to_linear_scaled[i]) v = i;
    747     i = v +   8; if (x >= stbir__srgb_offset_to_linear_scaled[i]) v = i;
    748     i = v +   4; if (x >= stbir__srgb_offset_to_linear_scaled[i]) v = i;
    749     i = v +   2; if (x >= stbir__srgb_offset_to_linear_scaled[i]) v = i;
    750     i = v +   1; if (x >= stbir__srgb_offset_to_linear_scaled[i]) v = i;
    751 
    752     return (stbir_uint8) v;
    753 }
    754 #endif
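
/* Illustrative sketch, not part of the library: the unrolled search in the
   non-IEEE path can be tried on a 3-bit analogue. demo_thresholds holds
   only the first 8 entries of the 256-entry table, and demo_search8 is a
   hypothetical name; the step sizes 4, 2, 1 play the role of 128 .. 1. */
```c
#include <assert.h>

/* First 8 thresholds of the scaled table (values scaled by 1<<28). */
static const int demo_thresholds[8] = {
    0, 40738, 122216, 203693, 285170, 366648, 448125, 529603
};

/* Largest index v in [0, 7] with demo_thresholds[v] <= x, found in three
   fixed steps instead of a data-dependent loop. Assumes x >= 0. */
static int demo_search8(int x)
{
    int v = 0;
    int i;

    i = v + 4; if (x >= demo_thresholds[i]) v = i;
    i = v + 2; if (x >= demo_thresholds[i]) v = i;
    i = v + 1; if (x >= demo_thresholds[i]) v = i;

    return v;
}
```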
    755 
    756 static float stbir__filter_trapezoid(float x, float scale)
    757 {
    758     float halfscale = scale / 2;
    759     float t = 0.5f + halfscale;
    760     STBIR_ASSERT(scale <= 1);
    761 
    762     x = (float)fabs(x);
    763 
    764     if (x >= t)
    765         return 0;
    766     else
    767     {
    768         float r = 0.5f - halfscale;
    769         if (x <= r)
    770             return 1;
    771         else
    772             return (t - x) / scale;
    773     }
    774 }
    775 
    776 static float stbir__support_trapezoid(float scale)
    777 {
    778     STBIR_ASSERT(scale <= 1);
    779     return 0.5f + scale / 2;
    780 }
    781 
    782 static float stbir__filter_triangle(float x, float s)
    783 {
    784     STBIR__UNUSED_PARAM(s);
    785 
    786     x = (float)fabs(x);
    787 
    788     if (x <= 1.0f)
    789         return 1 - x;
    790     else
    791         return 0;
    792 }
    793 
    794 static float stbir__filter_cubic(float x, float s)
    795 {
    796     STBIR__UNUSED_PARAM(s);
    797 
    798     x = (float)fabs(x);
    799 
    800     if (x < 1.0f)
    801         return (4 + x*x*(3*x - 6))/6;
    802     else if (x < 2.0f)
    803         return (8 + x*(-12 + x*(6 - x)))/6;
    804 
    805     return (0.0f);
    806 }
    807 
    808 static float stbir__filter_catmullrom(float x, float s)
    809 {
    810     STBIR__UNUSED_PARAM(s);
    811 
    812     x = (float)fabs(x);
    813 
    814     if (x < 1.0f)
    815         return 1 - x*x*(2.5f - 1.5f*x);
    816     else if (x < 2.0f)
    817         return 2 - x*(4 + x*(0.5f*x - 2.5f));
    818 
    819     return (0.0f);
    820 }
    821 
    822 static float stbir__filter_mitchell(float x, float s)
    823 {
    824     STBIR__UNUSED_PARAM(s);
    825 
    826     x = (float)fabs(x);
    827 
    828     if (x < 1.0f)
    829         return (16 + x*x*(21 * x - 36))/18;
    830     else if (x < 2.0f)
    831         return (32 + x*(-60 + x*(36 - 7*x)))/18;
    832 
    833     return (0.0f);
    834 }
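
/* Illustrative sketch, not part of the library: the three cubic kernels
   above each sum to 1 over the four taps that surround a sample position,
   which is why upsampling needs only a small correction when the weights
   are normalized. The demo_* functions are hypothetical copies without
   the unused scale parameter. */
```c
#include <assert.h>
#include <math.h>

static float demo_cubic(float x)       /* cubic B-spline */
{
    x = (float)fabs(x);
    if (x < 1.0f) return (4 + x*x*(3*x - 6))/6;
    if (x < 2.0f) return (8 + x*(-12 + x*(6 - x)))/6;
    return 0.0f;
}

static float demo_catmullrom(float x)
{
    x = (float)fabs(x);
    if (x < 1.0f) return 1 - x*x*(2.5f - 1.5f*x);
    if (x < 2.0f) return 2 - x*(4 + x*(0.5f*x - 2.5f));
    return 0.0f;
}

static float demo_mitchell(float x)
{
    x = (float)fabs(x);
    if (x < 1.0f) return (16 + x*x*(21*x - 36))/18;
    if (x < 2.0f) return (32 + x*(-60 + x*(36 - 7*x)))/18;
    return 0.0f;
}

/* Sum of the four taps at distances t+1, t, t-1, t-2 for t in [0, 1). */
static float demo_tap_sum(float (*kernel)(float), float t)
{
    return kernel(t + 1) + kernel(t) + kernel(t - 1) + kernel(t - 2);
}
```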
    835 
    836 static float stbir__support_zero(float s)
    837 {
    838     STBIR__UNUSED_PARAM(s);
    839     return 0;
    840 }
    841 
    842 static float stbir__support_one(float s)
    843 {
    844     STBIR__UNUSED_PARAM(s);
    845     return 1;
    846 }
    847 
    848 static float stbir__support_two(float s)
    849 {
    850     STBIR__UNUSED_PARAM(s);
    851     return 2;
    852 }
    853 
    854 static stbir__filter_info stbir__filter_info_table[] = {
    855         { NULL,                     stbir__support_zero },
    856         { stbir__filter_trapezoid,  stbir__support_trapezoid },
    857         { stbir__filter_triangle,   stbir__support_one },
    858         { stbir__filter_cubic,      stbir__support_two },
    859         { stbir__filter_catmullrom, stbir__support_two },
    860         { stbir__filter_mitchell,   stbir__support_two },
    861 };
    862 
    863 stbir__inline static int stbir__use_upsampling(float ratio)
    864 {
    865     return ratio > 1;
    866 }
    867 
    868 stbir__inline static int stbir__use_width_upsampling(stbir__info* stbir_info)
    869 {
    870     return stbir__use_upsampling(stbir_info->horizontal_scale);
    871 }
    872 
    873 stbir__inline static int stbir__use_height_upsampling(stbir__info* stbir_info)
    874 {
    875     return stbir__use_upsampling(stbir_info->vertical_scale);
    876 }
    877 
    878 // This is the maximum number of input samples that can affect an output sample
    879 // with the given filter
    880 static int stbir__get_filter_pixel_width(stbir_filter filter, float scale)
    881 {
    882     STBIR_ASSERT(filter != 0);
    883     STBIR_ASSERT(filter < STBIR__ARRAY_SIZE(stbir__filter_info_table));
    884 
    885     if (stbir__use_upsampling(scale))
    886         return (int)ceil(stbir__filter_info_table[filter].support(1/scale) * 2);
    887     else
    888         return (int)ceil(stbir__filter_info_table[filter].support(scale) * 2 / scale);
    889 }
    890 
    891 // This is how much to expand buffers to account for filters seeking outside
    892 // the image boundaries.
    893 static int stbir__get_filter_pixel_margin(stbir_filter filter, float scale)
    894 {
    895     return stbir__get_filter_pixel_width(filter, scale) / 2;
    896 }
    897 
    898 static int stbir__get_coefficient_width(stbir_filter filter, float scale)
    899 {
    900     if (stbir__use_upsampling(scale))
    901         return (int)ceil(stbir__filter_info_table[filter].support(1 / scale) * 2);
    902     else
    903         return (int)ceil(stbir__filter_info_table[filter].support(scale) * 2);
    904 }
    905 
    906 static int stbir__get_contributors(float scale, stbir_filter filter, int input_size, int output_size)
    907 {
    908     if (stbir__use_upsampling(scale))
    909         return output_size;
    910     else
    911         return (input_size + stbir__get_filter_pixel_margin(filter, scale) * 2);
    912 }
    913 
    914 static int stbir__get_total_horizontal_coefficients(stbir__info* info)
    915 {
    916     return info->horizontal_num_contributors
    917          * stbir__get_coefficient_width      (info->horizontal_filter, info->horizontal_scale);
    918 }
    919 
    920 static int stbir__get_total_vertical_coefficients(stbir__info* info)
    921 {
    922     return info->vertical_num_contributors
    923          * stbir__get_coefficient_width      (info->vertical_filter, info->vertical_scale);
    924 }
    925 
    926 static stbir__contributors* stbir__get_contributor(stbir__contributors* contributors, int n)
    927 {
    928     return &contributors[n];
    929 }
    930 
    931 // For perf reasons this code is duplicated in stbir__resample_horizontal_upsample/downsample;
    932 // if you change it here, change it there too.
    933 static float* stbir__get_coefficient(float* coefficients, stbir_filter filter, float scale, int n, int c)
    934 {
    935     int width = stbir__get_coefficient_width(filter, scale);
    936     return &coefficients[width*n + c];
    937 }
    938 
    939 static int stbir__edge_wrap_slow(stbir_edge edge, int n, int max)
    940 {
    941     switch (edge)
    942     {
    943     case STBIR_EDGE_ZERO:
    944         return 0; // we'll decode the wrong pixel here, and then overwrite with 0s later
    945 
    946     case STBIR_EDGE_CLAMP:
    947         if (n < 0)
    948             return 0;
    949 
    950         if (n >= max)
    951             return max - 1;
    952 
    953         return n; // not reached via stbir__edge_wrap(), which handles in-range n itself
    954 
    955     case STBIR_EDGE_REFLECT:
    956     {
    957         if (n < 0)
    958         {
    959             if (n > -max) // keeps -n below max; "n < max" is always true when n < 0
    960                 return -n;
    961             else
    962                 return max - 1;
    963         }
    964 
    965         if (n >= max)
    966         {
    967             int max2 = max * 2;
    968             if (n >= max2)
    969                 return 0;
    970             else
    971                 return max2 - n - 1;
    972         }
    973 
    974         return n; // not reached via stbir__edge_wrap(), which handles in-range n itself
    975     }
    976 
    977     case STBIR_EDGE_WRAP:
    978         if (n >= 0)
    979             return (n % max);
    980         else
    981         {
    982             int m = (-n) % max;
    983 
    984             if (m != 0)
    985                 m = max - m;
    986 
    987             return (m);
    988         }
    989         // NOTREACHED
    990 
    991     default:
    992         STBIR_ASSERT(!"Unimplemented edge type");
    993         return 0;
    994     }
    995 }
    996 
    997 stbir__inline static int stbir__edge_wrap(stbir_edge edge, int n, int max)
    998 {
    999     // avoid per-pixel switch
   1000     if (n >= 0 && n < max)
   1001         return n;
   1002     return stbir__edge_wrap_slow(edge, n, max);
   1003 }
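
/* Illustrative sketch, not part of the library: a compact stand-in for the
   clamp, reflect and wrap modes so their index mapping can be tried in
   isolation. demo_edge and demo_edge_index are hypothetical names, and the
   explicit far-left guard in the reflect case is an assumption about the
   intended behaviour. */
```c
#include <assert.h>

typedef enum { DEMO_CLAMP, DEMO_REFLECT, DEMO_WRAP } demo_edge;

/* Map any index n onto [0, max). Assumes max > 0. */
static int demo_edge_index(demo_edge edge, int n, int max)
{
    if (n >= 0 && n < max)
        return n;                      /* fast path: already in range */

    switch (edge) {
    case DEMO_CLAMP:
        return n < 0 ? 0 : max - 1;
    case DEMO_REFLECT:
        if (n < 0)
            return (n > -max) ? -n : max - 1;
        return (n < 2 * max) ? 2 * max - n - 1 : 0;
    case DEMO_WRAP:
    default: {
        int m = n % max;               /* may be negative in C */
        return m < 0 ? m + max : m;
    }
    }
}
```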
   1004 
   1005 // What input pixels contribute to this output pixel?
   1006 static void stbir__calculate_sample_range_upsample(int n, float out_filter_radius, float scale_ratio, float out_shift, int* in_first_pixel, int* in_last_pixel, float* in_center_of_out)
   1007 {
   1008     float out_pixel_center = (float)n + 0.5f;
   1009     float out_pixel_influence_lowerbound = out_pixel_center - out_filter_radius;
   1010     float out_pixel_influence_upperbound = out_pixel_center + out_filter_radius;
   1011 
   1012     float in_pixel_influence_lowerbound = (out_pixel_influence_lowerbound + out_shift) / scale_ratio;
   1013     float in_pixel_influence_upperbound = (out_pixel_influence_upperbound + out_shift) / scale_ratio;
   1014 
   1015     *in_center_of_out = (out_pixel_center + out_shift) / scale_ratio;
   1016     *in_first_pixel = (int)(floor(in_pixel_influence_lowerbound + 0.5));
   1017     *in_last_pixel = (int)(floor(in_pixel_influence_upperbound - 0.5));
   1018 }
   1019 
   1020 // What output pixels does this input pixel contribute to?
   1021 static void stbir__calculate_sample_range_downsample(int n, float in_pixels_radius, float scale_ratio, float out_shift, int* out_first_pixel, int* out_last_pixel, float* out_center_of_in)
   1022 {
   1023     float in_pixel_center = (float)n + 0.5f;
   1024     float in_pixel_influence_lowerbound = in_pixel_center - in_pixels_radius;
   1025     float in_pixel_influence_upperbound = in_pixel_center + in_pixels_radius;
   1026 
   1027     float out_pixel_influence_lowerbound = in_pixel_influence_lowerbound * scale_ratio - out_shift;
   1028     float out_pixel_influence_upperbound = in_pixel_influence_upperbound * scale_ratio - out_shift;
   1029 
   1030     *out_center_of_in = in_pixel_center * scale_ratio - out_shift;
   1031     *out_first_pixel = (int)(floor(out_pixel_influence_lowerbound + 0.5));
   1032     *out_last_pixel = (int)(floor(out_pixel_influence_upperbound - 0.5));
   1033 }
   1034 
   1035 static void stbir__calculate_coefficients_upsample(stbir_filter filter, float scale, int in_first_pixel, int in_last_pixel, float in_center_of_out, stbir__contributors* contributor, float* coefficient_group)
   1036 {
   1037     int i;
   1038     float total_filter = 0;
   1039     float filter_scale;
   1040 
   1041     STBIR_ASSERT(in_last_pixel - in_first_pixel <= (int)ceil(stbir__filter_info_table[filter].support(1/scale) * 2)); // Taken directly from stbir__get_coefficient_width() which we can't call because we don't know if we're horizontal or vertical.
   1042 
   1043     contributor->n0 = in_first_pixel;
   1044     contributor->n1 = in_last_pixel;
   1045 
   1046     STBIR_ASSERT(contributor->n1 >= contributor->n0);
   1047 
   1048     for (i = 0; i <= in_last_pixel - in_first_pixel; i++)
   1049     {
   1050         float in_pixel_center = (float)(i + in_first_pixel) + 0.5f;
   1051         coefficient_group[i] = stbir__filter_info_table[filter].kernel(in_center_of_out - in_pixel_center, 1 / scale);
   1052 
   1053         // If the coefficient is zero, skip it. (Don't do the <0 check here, we want the influence of those outside pixels.)
   1054         if (i == 0 && !coefficient_group[i])
   1055         {
   1056             contributor->n0 = ++in_first_pixel;
   1057             i--;
   1058             continue;
   1059         }
   1060 
   1061         total_filter += coefficient_group[i];
   1062     }
   1063 
   1064     STBIR_ASSERT(stbir__filter_info_table[filter].kernel((float)(in_last_pixel + 1) + 0.5f - in_center_of_out, 1/scale) == 0);
   1065 
   1066     STBIR_ASSERT(total_filter > 0.9f);
   1067     STBIR_ASSERT(total_filter < 1.1f); // Make sure it's not way off.
   1068 
   1069     // Make sure the sum of all coefficients is 1.
   1070     filter_scale = 1 / total_filter;
   1071 
   1072     for (i = 0; i <= in_last_pixel - in_first_pixel; i++)
   1073         coefficient_group[i] *= filter_scale;
   1074 
   1075     for (i = in_last_pixel - in_first_pixel; i >= 0; i--)
   1076     {
   1077         if (coefficient_group[i])
   1078             break;
   1079 
   1080         // This line has no weight. We can skip it.
   1081         contributor->n1 = contributor->n0 + i - 1;
   1082     }
   1083 }
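
/* Illustrative sketch, not part of the library: evaluate a kernel at each
   contributing input pixel, then rescale so the weights sum to 1. It uses
   the triangle kernel and omits the zero-trimming done above. The demo_*
   names are hypothetical. */
```c
#include <assert.h>
#include <math.h>

static float demo_triangle(float x)
{
    x = (float)fabs(x);
    return (x <= 1.0f) ? 1 - x : 0;
}

/* Fill w[0 .. last-first] with weights for input pixels first..last and
   normalize them. Returns the number of weights written. Assumes at least
   one weight is non-zero. */
static int demo_upsample_weights(float in_center_of_out, int first, int last, float *w)
{
    float total = 0;
    int i, count = last - first + 1;

    for (i = 0; i < count; i++) {
        float in_pixel_center = (float)(first + i) + 0.5f;
        w[i] = demo_triangle(in_center_of_out - in_pixel_center);
        total += w[i];
    }

    for (i = 0; i < count; i++)
        w[i] /= total;

    return count;
}
```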
   1084 
   1085 static void stbir__calculate_coefficients_downsample(stbir_filter filter, float scale_ratio, int out_first_pixel, int out_last_pixel, float out_center_of_in, stbir__contributors* contributor, float* coefficient_group)
   1086 {
   1087     int i;
   1088 
   1089     STBIR_ASSERT(out_last_pixel - out_first_pixel <= (int)ceil(stbir__filter_info_table[filter].support(scale_ratio) * 2)); // Taken directly from stbir__get_coefficient_width() which we can't call because we don't know if we're horizontal or vertical.
   1090 
   1091     contributor->n0 = out_first_pixel;
   1092     contributor->n1 = out_last_pixel;
   1093 
   1094     STBIR_ASSERT(contributor->n1 >= contributor->n0);
   1095 
   1096     for (i = 0; i <= out_last_pixel - out_first_pixel; i++)
   1097     {
   1098         float out_pixel_center = (float)(i + out_first_pixel) + 0.5f;
   1099         float x = out_pixel_center - out_center_of_in;
   1100         coefficient_group[i] = stbir__filter_info_table[filter].kernel(x, scale_ratio) * scale_ratio;
   1101     }
   1102 
   1103     STBIR_ASSERT(stbir__filter_info_table[filter].kernel((float)(out_last_pixel + 1) + 0.5f - out_center_of_in, scale_ratio) == 0);
   1104 
   1105     for (i = out_last_pixel - out_first_pixel; i >= 0; i--)
   1106     {
   1107         if (coefficient_group[i])
   1108             break;
   1109 
   1110         // This line has no weight. We can skip it.
   1111         contributor->n1 = contributor->n0 + i - 1;
   1112     }
   1113 }
   1114 
   1115 static void stbir__normalize_downsample_coefficients(stbir__contributors* contributors, float* coefficients, stbir_filter filter, float scale_ratio, int input_size, int output_size)
   1116 {
   1117     int num_contributors = stbir__get_contributors(scale_ratio, filter, input_size, output_size);
   1118     int num_coefficients = stbir__get_coefficient_width(filter, scale_ratio);
   1119     int i, j;
   1120     int skip;
   1121 
   1122     for (i = 0; i < output_size; i++)
   1123     {
   1124         float scale;
   1125         float total = 0;
   1126 
   1127         for (j = 0; j < num_contributors; j++)
   1128         {
   1129             if (i >= contributors[j].n0 && i <= contributors[j].n1)
   1130             {
   1131                 float coefficient = *stbir__get_coefficient(coefficients, filter, scale_ratio, j, i - contributors[j].n0);
   1132                 total += coefficient;
   1133             }
   1134             else if (i < contributors[j].n0)
   1135                 break;
   1136         }
   1137 
   1138         STBIR_ASSERT(total > 0.9f);
   1139         STBIR_ASSERT(total < 1.1f);
   1140 
   1141         scale = 1 / total;
   1142 
   1143         for (j = 0; j < num_contributors; j++)
   1144         {
   1145             if (i >= contributors[j].n0 && i <= contributors[j].n1)
   1146                 *stbir__get_coefficient(coefficients, filter, scale_ratio, j, i - contributors[j].n0) *= scale;
   1147             else if (i < contributors[j].n0)
   1148                 break;
   1149         }
   1150     }
   1151 
   1152     // Optimize: Skip zero coefficients and contributions outside of image bounds.
   1153     // Do this after normalizing because normalization depends on the n0/n1 values.
   1154     for (j = 0; j < num_contributors; j++)
   1155     {
   1156         int range, max, width;
   1157 
   1158         skip = 0;
   1159         while (*stbir__get_coefficient(coefficients, filter, scale_ratio, j, skip) == 0)
   1160             skip++;
   1161 
   1162         contributors[j].n0 += skip;
   1163 
   1164         while (contributors[j].n0 < 0)
   1165         {
   1166             contributors[j].n0++;
   1167             skip++;
   1168         }
   1169 
   1170         range = contributors[j].n1 - contributors[j].n0 + 1;
   1171         max = stbir__min(num_coefficients, range);
   1172 
   1173         width = stbir__get_coefficient_width(filter, scale_ratio);
   1174         for (i = 0; i < max; i++)
   1175         {
   1176             if (i + skip >= width)
   1177                 break;
   1178 
   1179             *stbir__get_coefficient(coefficients, filter, scale_ratio, j, i) = *stbir__get_coefficient(coefficients, filter, scale_ratio, j, i + skip);
   1180         }
   1181 
   1182         continue;
   1183     }
   1184 
   1185     // Using min to avoid writing into invalid pixels.
   1186     for (i = 0; i < num_contributors; i++)
   1187         contributors[i].n1 = stbir__min(contributors[i].n1, output_size - 1);
   1188 }
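
/* Illustrative sketch, not part of the library: when downsampling, each
   input pixel scatters kernel(x) * scale into the output row, so the
   weight arriving at one output pixel is already close to 1 before
   normalization. The demo_* names are hypothetical, and the shift is
   assumed to be 0. */
```c
#include <assert.h>
#include <math.h>

static float demo_trapezoid(float x, float scale)   /* same shape as the trapezoid kernel */
{
    float t = 0.5f + scale / 2;
    float r = 0.5f - scale / 2;

    x = (float)fabs(x);
    if (x >= t) return 0;
    if (x <= r) return 1;
    return (t - x) / scale;
}

/* Total weight that output pixel 'out' receives from input pixels
   0 .. input_size-1. */
static float demo_scatter_total(int out, int input_size, float scale)
{
    float total = 0;
    int n;

    for (n = 0; n < input_size; n++) {
        float out_center_of_in = ((float)n + 0.5f) * scale;
        float x = ((float)out + 0.5f) - out_center_of_in;
        total += demo_trapezoid(x, scale) * scale;
    }
    return total;
}
```
/* Halving a 4-pixel row gives each output pixel two inputs at weight 0.5. */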
   1189 
   1190 // Every scan line uses the same kernel values, so compute them once here
   1191 // and reuse them for all scan lines.
   1192 static void stbir__calculate_filters(stbir__contributors* contributors, float* coefficients, stbir_filter filter, float scale_ratio, float shift, int input_size, int output_size)
   1193 {
   1194     int n;
   1195     int total_contributors = stbir__get_contributors(scale_ratio, filter, input_size, output_size);
   1196 
   1197     if (stbir__use_upsampling(scale_ratio))
   1198     {
   1199         float out_pixels_radius = stbir__filter_info_table[filter].support(1 / scale_ratio) * scale_ratio;
   1200 
   1201         // Looping through out pixels
   1202         for (n = 0; n < total_contributors; n++)
   1203         {
   1204             float in_center_of_out; // Center of the current out pixel in the in pixel space
   1205             int in_first_pixel, in_last_pixel;
   1206 
   1207             stbir__calculate_sample_range_upsample(n, out_pixels_radius, scale_ratio, shift, &in_first_pixel, &in_last_pixel, &in_center_of_out);
   1208 
   1209             stbir__calculate_coefficients_upsample(filter, scale_ratio, in_first_pixel, in_last_pixel, in_center_of_out, stbir__get_contributor(contributors, n), stbir__get_coefficient(coefficients, filter, scale_ratio, n, 0));
   1210         }
   1211     }
   1212     else
   1213     {
   1214         float in_pixels_radius = stbir__filter_info_table[filter].support(scale_ratio) / scale_ratio;
   1215 
   1216         // Looping through in pixels
   1217         for (n = 0; n < total_contributors; n++)
   1218         {
   1219             float out_center_of_in; // Center of the current in pixel in the out pixel space
   1220             int out_first_pixel, out_last_pixel;
   1221             int n_adjusted = n - stbir__get_filter_pixel_margin(filter, scale_ratio);
   1222 
   1223             stbir__calculate_sample_range_downsample(n_adjusted, in_pixels_radius, scale_ratio, shift, &out_first_pixel, &out_last_pixel, &out_center_of_in);
   1224 
   1225             stbir__calculate_coefficients_downsample(filter, scale_ratio, out_first_pixel, out_last_pixel, out_center_of_in, stbir__get_contributor(contributors, n), stbir__get_coefficient(coefficients, filter, scale_ratio, n, 0));
   1226         }
   1227 
   1228         stbir__normalize_downsample_coefficients(contributors, coefficients, filter, scale_ratio, input_size, output_size);
   1229     }
   1230 }
   1231 
   1232 static float* stbir__get_decode_buffer(stbir__info* stbir_info)
   1233 {
   1234     // The 0 index of the decode buffer starts after the margin. This makes
   1235     // it okay to use negative indexes on the decode buffer.
   1236     return &stbir_info->decode_buffer[stbir_info->horizontal_filter_pixel_margin * stbir_info->channels];
   1237 }
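
/* Illustrative sketch, not part of the library: a row buffer allocated
   with a margin on both sides, with the returned pointer moved past the
   left margin so that negative pixel indices are valid. demo_alloc_row
   and demo_free_row are hypothetical helpers. */
```c
#include <assert.h>
#include <stdlib.h>

/* Allocate (width + 2*margin) zeroed pixels and return a pointer to
   pixel 0, so pixel indices -margin .. width+margin-1 are all valid. */
static float *demo_alloc_row(int width, int margin, int channels)
{
    float *base = (float *)calloc((size_t)(width + 2 * margin) * channels, sizeof(float));
    return base ? base + margin * channels : NULL;
}

/* Undo the offset before handing the block back to free(). */
static void demo_free_row(float *row, int margin, int channels)
{
    if (row)
        free(row - margin * channels);
}
```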
   1238 
   1239 #define STBIR__DECODE(type, colorspace) ((type) * (STBIR_MAX_COLORSPACES) + (colorspace))
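
/* Illustrative sketch, not part of the library: packing (type, colorspace)
   into one integer lets a single switch select the decoder. The DEMO_*
   enums below are assumed layouts for illustration only; the real
   enumerations are declared in the header section of this file. */
```c
#include <assert.h>
#include <string.h>

enum { DEMO_COLORSPACE_LINEAR, DEMO_COLORSPACE_SRGB, DEMO_MAX_COLORSPACES };
enum { DEMO_TYPE_UINT8, DEMO_TYPE_UINT16, DEMO_TYPE_UINT32, DEMO_TYPE_FLOAT };

#define DEMO_DECODE(type, colorspace) ((type) * (DEMO_MAX_COLORSPACES) + (colorspace))

/* The macro expands to a constant expression, so it can label cases. */
static const char *demo_decoder_name(int type, int colorspace)
{
    switch (DEMO_DECODE(type, colorspace)) {
    case DEMO_DECODE(DEMO_TYPE_UINT8,  DEMO_COLORSPACE_LINEAR): return "uint8/linear";
    case DEMO_DECODE(DEMO_TYPE_UINT8,  DEMO_COLORSPACE_SRGB):   return "uint8/srgb";
    case DEMO_DECODE(DEMO_TYPE_UINT16, DEMO_COLORSPACE_SRGB):   return "uint16/srgb";
    default:                                                    return "other";
    }
}
```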
   1240 
   1241 static void stbir__decode_scanline(stbir__info* stbir_info, int n)
   1242 {
   1243     int c;
   1244     int channels = stbir_info->channels;
   1245     int alpha_channel = stbir_info->alpha_channel;
   1246     int type = stbir_info->type;
   1247     int colorspace = stbir_info->colorspace;
   1248     int input_w = stbir_info->input_w;
   1249     size_t input_stride_bytes = stbir_info->input_stride_bytes;
   1250     float* decode_buffer = stbir__get_decode_buffer(stbir_info);
   1251     stbir_edge edge_horizontal = stbir_info->edge_horizontal;
   1252     stbir_edge edge_vertical = stbir_info->edge_vertical;
   1253     size_t in_buffer_row_offset = stbir__edge_wrap(edge_vertical, n, stbir_info->input_h) * input_stride_bytes;
   1254     const void* input_data = (char *) stbir_info->input_data + in_buffer_row_offset;
   1255     int max_x = input_w + stbir_info->horizontal_filter_pixel_margin;
   1256     int decode = STBIR__DECODE(type, colorspace);
   1257 
   1258     int x = -stbir_info->horizontal_filter_pixel_margin;
   1259 
   1260     // special handling for STBIR_EDGE_ZERO because it needs to return an item that doesn't appear in the input,
   1261     // and we want to avoid paying overhead on every pixel if not STBIR_EDGE_ZERO
   1262     if (edge_vertical == STBIR_EDGE_ZERO && (n < 0 || n >= stbir_info->input_h))
   1263     {
   1264         for (; x < max_x; x++)
   1265             for (c = 0; c < channels; c++)
   1266                 decode_buffer[x*channels + c] = 0;
   1267         return;
   1268     }
   1269 
   1270     switch (decode)
   1271     {
   1272     case STBIR__DECODE(STBIR_TYPE_UINT8, STBIR_COLORSPACE_LINEAR):
   1273         for (; x < max_x; x++)
   1274         {
   1275             int decode_pixel_index = x * channels;
   1276             int input_pixel_index = stbir__edge_wrap(edge_horizontal, x, input_w) * channels;
   1277             for (c = 0; c < channels; c++)
   1278                 decode_buffer[decode_pixel_index + c] = ((float)((const unsigned char*)input_data)[input_pixel_index + c]) / stbir__max_uint8_as_float;
   1279         }
   1280         break;
   1281 
   1282     case STBIR__DECODE(STBIR_TYPE_UINT8, STBIR_COLORSPACE_SRGB):
   1283         for (; x < max_x; x++)
   1284         {
   1285             int decode_pixel_index = x * channels;
   1286             int input_pixel_index = stbir__edge_wrap(edge_horizontal, x, input_w) * channels;
   1287             for (c = 0; c < channels; c++)
   1288                 decode_buffer[decode_pixel_index + c] = stbir__srgb_uchar_to_linear_float[((const unsigned char*)input_data)[input_pixel_index + c]];
   1289 
   1290             if (!(stbir_info->flags&STBIR_FLAG_ALPHA_USES_COLORSPACE))
   1291                 decode_buffer[decode_pixel_index + alpha_channel] = ((float)((const unsigned char*)input_data)[input_pixel_index + alpha_channel]) / stbir__max_uint8_as_float;
   1292         }
   1293         break;
   1294 
   1295     case STBIR__DECODE(STBIR_TYPE_UINT16, STBIR_COLORSPACE_LINEAR):
   1296         for (; x < max_x; x++)
   1297         {
   1298             int decode_pixel_index = x * channels;
   1299             int input_pixel_index = stbir__edge_wrap(edge_horizontal, x, input_w) * channels;
   1300             for (c = 0; c < channels; c++)
   1301                 decode_buffer[decode_pixel_index + c] = ((float)((const unsigned short*)input_data)[input_pixel_index + c]) / stbir__max_uint16_as_float;
   1302         }
   1303         break;
   1304 
   1305     case STBIR__DECODE(STBIR_TYPE_UINT16, STBIR_COLORSPACE_SRGB):
   1306         for (; x < max_x; x++)
   1307         {
   1308             int decode_pixel_index = x * channels;
   1309             int input_pixel_index = stbir__edge_wrap(edge_horizontal, x, input_w) * channels;
   1310             for (c = 0; c < channels; c++)
   1311                 decode_buffer[decode_pixel_index + c] = stbir__srgb_to_linear(((float)((const unsigned short*)input_data)[input_pixel_index + c]) / stbir__max_uint16_as_float);
   1312 
   1313             if (!(stbir_info->flags&STBIR_FLAG_ALPHA_USES_COLORSPACE))
   1314                 decode_buffer[decode_pixel_index + alpha_channel] = ((float)((const unsigned short*)input_data)[input_pixel_index + alpha_channel]) / stbir__max_uint16_as_float;
   1315         }
   1316         break;
   1317 
   1318     case STBIR__DECODE(STBIR_TYPE_UINT32, STBIR_COLORSPACE_LINEAR):
   1319         for (; x < max_x; x++)
   1320         {
   1321             int decode_pixel_index = x * channels;
   1322             int input_pixel_index = stbir__edge_wrap(edge_horizontal, x, input_w) * channels;
   1323             for (c = 0; c < channels; c++)
   1324                 decode_buffer[decode_pixel_index + c] = (float)(((double)((const unsigned int*)input_data)[input_pixel_index + c]) / stbir__max_uint32_as_float);
   1325         }
   1326         break;
   1327 
   1328     case STBIR__DECODE(STBIR_TYPE_UINT32, STBIR_COLORSPACE_SRGB):
   1329         for (; x < max_x; x++)
   1330         {
   1331             int decode_pixel_index = x * channels;
   1332             int input_pixel_index = stbir__edge_wrap(edge_horizontal, x, input_w) * channels;
   1333             for (c = 0; c < channels; c++)
   1334                 decode_buffer[decode_pixel_index + c] = stbir__srgb_to_linear((float)(((double)((const unsigned int*)input_data)[input_pixel_index + c]) / stbir__max_uint32_as_float));
   1335 
   1336             if (!(stbir_info->flags&STBIR_FLAG_ALPHA_USES_COLORSPACE))
   1337                 decode_buffer[decode_pixel_index + alpha_channel] = (float)(((double)((const unsigned int*)input_data)[input_pixel_index + alpha_channel]) / stbir__max_uint32_as_float);
   1338         }
   1339         break;
   1340 
   1341     case STBIR__DECODE(STBIR_TYPE_FLOAT, STBIR_COLORSPACE_LINEAR):
   1342         for (; x < max_x; x++)
   1343         {
   1344             int decode_pixel_index = x * channels;
   1345             int input_pixel_index = stbir__edge_wrap(edge_horizontal, x, input_w) * channels;
   1346             for (c = 0; c < channels; c++)
   1347                 decode_buffer[decode_pixel_index + c] = ((const float*)input_data)[input_pixel_index + c];
   1348         }
   1349         break;
   1350 
   1351     case STBIR__DECODE(STBIR_TYPE_FLOAT, STBIR_COLORSPACE_SRGB):
   1352         for (; x < max_x; x++)
   1353         {
   1354             int decode_pixel_index = x * channels;
   1355             int input_pixel_index = stbir__edge_wrap(edge_horizontal, x, input_w) * channels;
   1356             for (c = 0; c < channels; c++)
   1357                 decode_buffer[decode_pixel_index + c] = stbir__srgb_to_linear(((const float*)input_data)[input_pixel_index + c]);
   1358 
   1359             if (!(stbir_info->flags&STBIR_FLAG_ALPHA_USES_COLORSPACE))
   1360                 decode_buffer[decode_pixel_index + alpha_channel] = ((const float*)input_data)[input_pixel_index + alpha_channel];
   1361         }
   1362 
   1363         break;
   1364 
   1365     default:
   1366         STBIR_ASSERT(!"Unknown type/colorspace/channels combination.");
   1367         break;
   1368     }
   1369 
   1370     if (!(stbir_info->flags & STBIR_FLAG_ALPHA_PREMULTIPLIED))
   1371     {
   1372         for (x = -stbir_info->horizontal_filter_pixel_margin; x < max_x; x++)
   1373         {
   1374             int decode_pixel_index = x * channels;
   1375 
   1376             // Premultiply color by alpha. A zero alpha would clobber (zero out) the color values, so integer types get a tiny epsilon added first.
   1377             float alpha = decode_buffer[decode_pixel_index + alpha_channel];
   1378 #ifndef STBIR_NO_ALPHA_EPSILON
   1379             if (stbir_info->type != STBIR_TYPE_FLOAT) {
   1380                 alpha += STBIR_ALPHA_EPSILON;
   1381                 decode_buffer[decode_pixel_index + alpha_channel] = alpha;
   1382             }
   1383 #endif
   1384             for (c = 0; c < channels; c++)
   1385             {
   1386                 if (c == alpha_channel)
   1387                     continue;
   1388 
   1389                 decode_buffer[decode_pixel_index + c] *= alpha;
   1390             }
   1391         }
   1392     }
   1393 
   1394     if (edge_horizontal == STBIR_EDGE_ZERO)
   1395     {
   1396         for (x = -stbir_info->horizontal_filter_pixel_margin; x < 0; x++)
   1397         {
   1398             for (c = 0; c < channels; c++)
   1399                 decode_buffer[x*channels + c] = 0;
   1400         }
   1401         for (x = input_w; x < max_x; x++)
   1402         {
   1403             for (c = 0; c < channels; c++)
   1404                 decode_buffer[x*channels + c] = 0;
   1405         }
   1406     }
   1407 }
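
The premultiply step at the end of the decode routine can be tried in isolation. This is a minimal sketch, not the library's code: it assumes a linear RGBA8 pixel with alpha in the last channel, and `EXAMPLE_ALPHA_EPSILON`, `example_decode_premultiply` and `example_unpremultiply` are hypothetical names. The epsilon value (2^-80) is an assumption chosen to be far below one step of any integer channel.

```c
#include <assert.h>

/* Hypothetical stand-in for the alpha epsilon: 2^-80, far smaller than one
   step of an 8-, 16- or 32-bit integer channel. */
#define EXAMPLE_ALPHA_EPSILON ((float)1 / (1 << 20) / (1 << 20) / (1 << 20) / (1 << 20))

/* Decode one linear RGBA8 pixel to floats and premultiply color by alpha.
   The epsilon keeps a zero alpha from wiping out the color channels. */
static void example_decode_premultiply(const unsigned char in[4], float out[4])
{
    int c;
    float alpha;
    assert(in && out);
    alpha = (float)in[3] / 255.0f + EXAMPLE_ALPHA_EPSILON;
    for (c = 0; c < 3; c++)
        out[c] = ((float)in[c] / 255.0f) * alpha;
    out[3] = alpha;
}

/* Undo the premultiply, guarding against an exactly-zero alpha. */
static float example_unpremultiply(float color, float alpha)
{
    return alpha ? color / alpha : 0.0f;
}
```

Decoding a fully transparent pixel and then calling `example_unpremultiply` on a color channel gives back the original color, which is the property the epsilon exists to preserve. For an opaque pixel the epsilon is lost in float rounding, so alpha stays exactly 1.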
   1408 
   1409 static float* stbir__get_ring_buffer_entry(float* ring_buffer, int index, int ring_buffer_length)
   1410 {
   1411     return &ring_buffer[index * ring_buffer_length]; // ring_buffer_length is counted in floats, not bytes
   1412 }
   1413 
   1414 static float* stbir__add_empty_ring_buffer_entry(stbir__info* stbir_info, int n)
   1415 {
   1416     int ring_buffer_index;
   1417     float* ring_buffer;
   1418 
   1419     stbir_info->ring_buffer_last_scanline = n;
   1420 
   1421     if (stbir_info->ring_buffer_begin_index < 0)
   1422     {
   1423         ring_buffer_index = stbir_info->ring_buffer_begin_index = 0;
   1424         stbir_info->ring_buffer_first_scanline = n;
   1425     }
   1426     else
   1427     {
   1428         ring_buffer_index = (stbir_info->ring_buffer_begin_index + (stbir_info->ring_buffer_last_scanline - stbir_info->ring_buffer_first_scanline)) % stbir_info->ring_buffer_num_entries;
   1429         STBIR_ASSERT(ring_buffer_index != stbir_info->ring_buffer_begin_index);
   1430     }
   1431 
   1432     ring_buffer = stbir__get_ring_buffer_entry(stbir_info->ring_buffer, ring_buffer_index, stbir_info->ring_buffer_length_bytes / sizeof(float));
   1433     memset(ring_buffer, 0, stbir_info->ring_buffer_length_bytes);
   1434 
   1435     return ring_buffer;
   1436 }
   1437 
   1438 
   1439 static void stbir__resample_horizontal_upsample(stbir__info* stbir_info, float* output_buffer)
   1440 {
   1441     int x, k;
   1442     int output_w = stbir_info->output_w;
   1443     int channels = stbir_info->channels;
   1444     float* decode_buffer = stbir__get_decode_buffer(stbir_info);
   1445     stbir__contributors* horizontal_contributors = stbir_info->horizontal_contributors;
   1446     float* horizontal_coefficients = stbir_info->horizontal_coefficients;
   1447     int coefficient_width = stbir_info->horizontal_coefficient_width;
   1448 
   1449     for (x = 0; x < output_w; x++)
   1450     {
   1451         int n0 = horizontal_contributors[x].n0;
   1452         int n1 = horizontal_contributors[x].n1;
   1453 
   1454         int out_pixel_index = x * channels;
   1455         int coefficient_group = coefficient_width * x;
   1456         int coefficient_counter = 0;
   1457 
   1458         STBIR_ASSERT(n1 >= n0);
   1459         STBIR_ASSERT(n0 >= -stbir_info->horizontal_filter_pixel_margin);
   1460         STBIR_ASSERT(n1 >= -stbir_info->horizontal_filter_pixel_margin);
   1461         STBIR_ASSERT(n0 < stbir_info->input_w + stbir_info->horizontal_filter_pixel_margin);
   1462         STBIR_ASSERT(n1 < stbir_info->input_w + stbir_info->horizontal_filter_pixel_margin);
   1463 
   1464         switch (channels) {
   1465             case 1:
   1466                 for (k = n0; k <= n1; k++)
   1467                 {
   1468                     int in_pixel_index = k * 1;
   1469                     float coefficient = horizontal_coefficients[coefficient_group + coefficient_counter++];
   1470                     STBIR_ASSERT(coefficient != 0);
   1471                     output_buffer[out_pixel_index + 0] += decode_buffer[in_pixel_index + 0] * coefficient;
   1472                 }
   1473                 break;
   1474             case 2:
   1475                 for (k = n0; k <= n1; k++)
   1476                 {
   1477                     int in_pixel_index = k * 2;
   1478                     float coefficient = horizontal_coefficients[coefficient_group + coefficient_counter++];
   1479                     STBIR_ASSERT(coefficient != 0);
   1480                     output_buffer[out_pixel_index + 0] += decode_buffer[in_pixel_index + 0] * coefficient;
   1481                     output_buffer[out_pixel_index + 1] += decode_buffer[in_pixel_index + 1] * coefficient;
   1482                 }
   1483                 break;
   1484             case 3:
   1485                 for (k = n0; k <= n1; k++)
   1486                 {
   1487                     int in_pixel_index = k * 3;
   1488                     float coefficient = horizontal_coefficients[coefficient_group + coefficient_counter++];
   1489                     STBIR_ASSERT(coefficient != 0);
   1490                     output_buffer[out_pixel_index + 0] += decode_buffer[in_pixel_index + 0] * coefficient;
   1491                     output_buffer[out_pixel_index + 1] += decode_buffer[in_pixel_index + 1] * coefficient;
   1492                     output_buffer[out_pixel_index + 2] += decode_buffer[in_pixel_index + 2] * coefficient;
   1493                 }
   1494                 break;
   1495             case 4:
   1496                 for (k = n0; k <= n1; k++)
   1497                 {
   1498                     int in_pixel_index = k * 4;
   1499                     float coefficient = horizontal_coefficients[coefficient_group + coefficient_counter++];
   1500                     STBIR_ASSERT(coefficient != 0);
   1501                     output_buffer[out_pixel_index + 0] += decode_buffer[in_pixel_index + 0] * coefficient;
   1502                     output_buffer[out_pixel_index + 1] += decode_buffer[in_pixel_index + 1] * coefficient;
   1503                     output_buffer[out_pixel_index + 2] += decode_buffer[in_pixel_index + 2] * coefficient;
   1504                     output_buffer[out_pixel_index + 3] += decode_buffer[in_pixel_index + 3] * coefficient;
   1505                 }
   1506                 break;
   1507             default:
   1508                 for (k = n0; k <= n1; k++)
   1509                 {
   1510                     int in_pixel_index = k * channels;
   1511                     float coefficient = horizontal_coefficients[coefficient_group + coefficient_counter++];
   1512                     int c;
   1513                     STBIR_ASSERT(coefficient != 0);
   1514                     for (c = 0; c < channels; c++)
   1515                         output_buffer[out_pixel_index + c] += decode_buffer[in_pixel_index + c] * coefficient;
   1516                 }
   1517                 break;
   1518         }
   1519     }
   1520 }
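
The upsampling routine above is a gather: every output pixel pulls from its own range of input pixels. Below is a single-channel sketch of that loop shape under simplifying assumptions: no filter margin (so `n0 >= 0`), the result is assigned rather than accumulated into a pre-zeroed buffer, and `example_contributors` and `example_gather_resample` are hypothetical names, not part of the library.

```c
#include <assert.h>

/* Hypothetical contributor range: output sample x reads inputs n0..n1 inclusive. */
typedef struct { int n0, n1; } example_contributors;

/* Gather-style 1-channel resample: each output sample sums its weighted inputs.
   coefficients holds coefficient_width weights per output sample. */
static void example_gather_resample(const float *input, float *output, int output_w,
                                    const example_contributors *contributors,
                                    const float *coefficients, int coefficient_width)
{
    int x, k;
    for (x = 0; x < output_w; x++)
    {
        int n0 = contributors[x].n0;
        int n1 = contributors[x].n1;
        float sum = 0.0f;
        assert(n1 >= n0 && n1 - n0 < coefficient_width);
        for (k = n0; k <= n1; k++)
            sum += input[k] * coefficients[coefficient_width * x + (k - n0)];
        output[x] = sum;
    }
}
```

With three inputs `{0, 10, 20}`, four outputs, and linear weights, the two interior outputs blend neighbouring inputs equally and come out as 5 and 15.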
   1521 
   1522 static void stbir__resample_horizontal_downsample(stbir__info* stbir_info, float* output_buffer)
   1523 {
   1524     int x, k;
   1525     int input_w = stbir_info->input_w;
   1526     int channels = stbir_info->channels;
   1527     float* decode_buffer = stbir__get_decode_buffer(stbir_info);
   1528     stbir__contributors* horizontal_contributors = stbir_info->horizontal_contributors;
   1529     float* horizontal_coefficients = stbir_info->horizontal_coefficients;
   1530     int coefficient_width = stbir_info->horizontal_coefficient_width;
   1531     int filter_pixel_margin = stbir_info->horizontal_filter_pixel_margin;
   1532     int max_x = input_w + filter_pixel_margin * 2;
   1533 
   1534     STBIR_ASSERT(!stbir__use_width_upsampling(stbir_info));
   1535 
   1536     switch (channels) {
   1537         case 1:
   1538             for (x = 0; x < max_x; x++)
   1539             {
   1540                 int n0 = horizontal_contributors[x].n0;
   1541                 int n1 = horizontal_contributors[x].n1;
   1542 
   1543                 int in_x = x - filter_pixel_margin;
   1544                 int in_pixel_index = in_x * 1;
   1545                 int max_n = n1;
   1546                 int coefficient_group = coefficient_width * x;
   1547 
   1548                 for (k = n0; k <= max_n; k++)
   1549                 {
   1550                     int out_pixel_index = k * 1;
   1551                     float coefficient = horizontal_coefficients[coefficient_group + k - n0];
   1552                     STBIR_ASSERT(coefficient != 0);
   1553                     output_buffer[out_pixel_index + 0] += decode_buffer[in_pixel_index + 0] * coefficient;
   1554                 }
   1555             }
   1556             break;
   1557 
   1558         case 2:
   1559             for (x = 0; x < max_x; x++)
   1560             {
   1561                 int n0 = horizontal_contributors[x].n0;
   1562                 int n1 = horizontal_contributors[x].n1;
   1563 
   1564                 int in_x = x - filter_pixel_margin;
   1565                 int in_pixel_index = in_x * 2;
   1566                 int max_n = n1;
   1567                 int coefficient_group = coefficient_width * x;
   1568 
   1569                 for (k = n0; k <= max_n; k++)
   1570                 {
   1571                     int out_pixel_index = k * 2;
   1572                     float coefficient = horizontal_coefficients[coefficient_group + k - n0];
   1573                     STBIR_ASSERT(coefficient != 0);
   1574                     output_buffer[out_pixel_index + 0] += decode_buffer[in_pixel_index + 0] * coefficient;
   1575                     output_buffer[out_pixel_index + 1] += decode_buffer[in_pixel_index + 1] * coefficient;
   1576                 }
   1577             }
   1578             break;
   1579 
   1580         case 3:
   1581             for (x = 0; x < max_x; x++)
   1582             {
   1583                 int n0 = horizontal_contributors[x].n0;
   1584                 int n1 = horizontal_contributors[x].n1;
   1585 
   1586                 int in_x = x - filter_pixel_margin;
   1587                 int in_pixel_index = in_x * 3;
   1588                 int max_n = n1;
   1589                 int coefficient_group = coefficient_width * x;
   1590 
   1591                 for (k = n0; k <= max_n; k++)
   1592                 {
   1593                     int out_pixel_index = k * 3;
   1594                     float coefficient = horizontal_coefficients[coefficient_group + k - n0];
   1595                     STBIR_ASSERT(coefficient != 0);
   1596                     output_buffer[out_pixel_index + 0] += decode_buffer[in_pixel_index + 0] * coefficient;
   1597                     output_buffer[out_pixel_index + 1] += decode_buffer[in_pixel_index + 1] * coefficient;
   1598                     output_buffer[out_pixel_index + 2] += decode_buffer[in_pixel_index + 2] * coefficient;
   1599                 }
   1600             }
   1601             break;
   1602 
   1603         case 4:
   1604             for (x = 0; x < max_x; x++)
   1605             {
   1606                 int n0 = horizontal_contributors[x].n0;
   1607                 int n1 = horizontal_contributors[x].n1;
   1608 
   1609                 int in_x = x - filter_pixel_margin;
   1610                 int in_pixel_index = in_x * 4;
   1611                 int max_n = n1;
   1612                 int coefficient_group = coefficient_width * x;
   1613 
   1614                 for (k = n0; k <= max_n; k++)
   1615                 {
   1616                     int out_pixel_index = k * 4;
   1617                     float coefficient = horizontal_coefficients[coefficient_group + k - n0];
   1618                     STBIR_ASSERT(coefficient != 0);
   1619                     output_buffer[out_pixel_index + 0] += decode_buffer[in_pixel_index + 0] * coefficient;
   1620                     output_buffer[out_pixel_index + 1] += decode_buffer[in_pixel_index + 1] * coefficient;
   1621                     output_buffer[out_pixel_index + 2] += decode_buffer[in_pixel_index + 2] * coefficient;
   1622                     output_buffer[out_pixel_index + 3] += decode_buffer[in_pixel_index + 3] * coefficient;
   1623                 }
   1624             }
   1625             break;
   1626 
   1627         default:
   1628             for (x = 0; x < max_x; x++)
   1629             {
   1630                 int n0 = horizontal_contributors[x].n0;
   1631                 int n1 = horizontal_contributors[x].n1;
   1632 
   1633                 int in_x = x - filter_pixel_margin;
   1634                 int in_pixel_index = in_x * channels;
   1635                 int max_n = n1;
   1636                 int coefficient_group = coefficient_width * x;
   1637 
   1638                 for (k = n0; k <= max_n; k++)
   1639                 {
   1640                     int c;
   1641                     int out_pixel_index = k * channels;
   1642                     float coefficient = horizontal_coefficients[coefficient_group + k - n0];
   1643                     STBIR_ASSERT(coefficient != 0);
   1644                     for (c = 0; c < channels; c++)
   1645                         output_buffer[out_pixel_index + c] += decode_buffer[in_pixel_index + c] * coefficient;
   1646                 }
   1647             }
   1648             break;
   1649     }
   1650 }
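
The downsampling routine inverts the loop: it walks the input pixels and scatters each one into the output pixels it contributes to. The sketch below shows that shape for a single channel. It is a simplification that ignores the filter margin (the real loop runs over `input_w + 2 * margin` and offsets the input index), and `example_contributors` and `example_scatter_resample` are hypothetical names.

```c
#include <assert.h>

/* Hypothetical contributor range: input sample x feeds outputs n0..n1 inclusive. */
typedef struct { int n0, n1; } example_contributors;

/* Scatter-style 1-channel resample: each input sample adds its weighted value
   to every output sample in n0..n1. The output is zeroed first because the
   results accumulate. */
static void example_scatter_resample(const float *input, int input_w,
                                     float *output, int output_w,
                                     const example_contributors *contributors,
                                     const float *coefficients, int coefficient_width)
{
    int x, k;
    for (k = 0; k < output_w; k++)
        output[k] = 0.0f;
    for (x = 0; x < input_w; x++)
    {
        int n0 = contributors[x].n0;
        int n1 = contributors[x].n1;
        assert(n0 >= 0 && n1 < output_w && n1 - n0 < coefficient_width);
        for (k = n0; k <= n1; k++)
            output[k] += input[x] * coefficients[coefficient_width * x + (k - n0)];
    }
}
```

A 2:1 box downsample of `{2, 4, 6, 8}` with weight 0.5 per input yields `{3, 7}`. Scattering suits downsampling because each input touches only a few outputs, whereas each output may depend on many inputs.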
   1651 
   1652 static void stbir__decode_and_resample_upsample(stbir__info* stbir_info, int n)
   1653 {
   1654     // Decode the nth scanline from the source image into the decode buffer.
   1655     stbir__decode_scanline(stbir_info, n);
   1656 
   1657     // Now resample it into the ring buffer.
   1658     if (stbir__use_width_upsampling(stbir_info))
   1659         stbir__resample_horizontal_upsample(stbir_info, stbir__add_empty_ring_buffer_entry(stbir_info, n));
   1660     else
   1661         stbir__resample_horizontal_downsample(stbir_info, stbir__add_empty_ring_buffer_entry(stbir_info, n));
   1662 
   1663     // Now it's sitting in the ring buffer ready to be used as source for the vertical sampling.
   1664 }
   1665 
   1666 static void stbir__decode_and_resample_downsample(stbir__info* stbir_info, int n)
   1667 {
   1668     // Decode the nth scanline from the source image into the decode buffer.
   1669     stbir__decode_scanline(stbir_info, n);
   1670 
   1671     memset(stbir_info->horizontal_buffer, 0, stbir_info->output_w * stbir_info->channels * sizeof(float));
   1672 
   1673     // Now resample it into the horizontal buffer.
   1674     if (stbir__use_width_upsampling(stbir_info))
   1675         stbir__resample_horizontal_upsample(stbir_info, stbir_info->horizontal_buffer);
   1676     else
   1677         stbir__resample_horizontal_downsample(stbir_info, stbir_info->horizontal_buffer);
   1678 
   1679     // Now it's sitting in the horizontal buffer ready to be distributed into the ring buffers.
   1680 }
   1681 
   1682 // Get the specified scanline from the ring buffer.
   1683 static float* stbir__get_ring_buffer_scanline(int get_scanline, float* ring_buffer, int begin_index, int first_scanline, int ring_buffer_num_entries, int ring_buffer_length)
   1684 {
   1685     int ring_buffer_index = (begin_index + (get_scanline - first_scanline)) % ring_buffer_num_entries;
   1686     return stbir__get_ring_buffer_entry(ring_buffer, ring_buffer_index, ring_buffer_length);
   1687 }
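
The ring buffer bookkeeping reduces to a little index arithmetic, sketched here without any pixel storage. `example_ring` and its functions are hypothetical. In particular, `example_ring_pop` is an assumption about how the oldest scanline might be retired; it is not taken from the code shown above.

```c
#include <assert.h>

/* Hypothetical miniature of the ring buffer state (no pixel storage). */
typedef struct {
    int begin_index;     /* slot holding first_scanline, or -1 when empty */
    int first_scanline;  /* oldest scanline still stored */
    int last_scanline;   /* newest scanline stored */
    int num_entries;     /* number of slots */
} example_ring;

/* Slot that holds a given scanline: offset from the oldest one, modulo size. */
static int example_ring_slot(const example_ring *r, int scanline)
{
    assert(scanline >= r->first_scanline && scanline <= r->last_scanline);
    return (r->begin_index + (scanline - r->first_scanline)) % r->num_entries;
}

/* Append scanline n and return the slot it should be written to. */
static int example_ring_add(example_ring *r, int n)
{
    int slot;
    r->last_scanline = n;
    if (r->begin_index < 0)
    {
        /* First use: start at slot 0. */
        r->begin_index = 0;
        r->first_scanline = n;
        return 0;
    }
    slot = example_ring_slot(r, n);
    assert(slot != r->begin_index); /* would overwrite the oldest entry */
    return slot;
}

/* Assumed helper: retire the oldest scanline so its slot can be reused. */
static void example_ring_pop(example_ring *r)
{
    if (r->first_scanline == r->last_scanline)
    {
        r->begin_index = -1;
        return;
    }
    r->first_scanline++;
    r->begin_index = (r->begin_index + 1) % r->num_entries;
}
```

With three slots, adding scanlines 10, 11 and 12 fills slots 0, 1 and 2. After one pop, scanline 13 wraps around into slot 0.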
   1688 
   1689 
   1690 static void stbir__encode_scanline(stbir__info* stbir_info, int num_pixels, void *output_buffer, float *encode_buffer, int channels, int alpha_channel, int decode)
   1691 {
   1692     int x;
   1693     int n;
   1694     int num_nonalpha;
   1695     stbir_uint16 nonalpha[STBIR_MAX_CHANNELS];
   1696 
   1697     if (!(stbir_info->flags&STBIR_FLAG_ALPHA_PREMULTIPLIED))
   1698     {
   1699         for (x=0; x < num_pixels; ++x)
   1700         {
   1701             int pixel_index = x*channels;
   1702 
   1703             float alpha = encode_buffer[pixel_index + alpha_channel];
   1704             float reciprocal_alpha = alpha ? 1.0f / alpha : 0;
   1705 
   1706             // unrolling this produced a 1% slowdown upscaling a large RGBA linear-space image on my machine - stb
   1707             for (n = 0; n < channels; n++)
   1708                 if (n != alpha_channel)
   1709                     encode_buffer[pixel_index + n] *= reciprocal_alpha;
   1710 
   1711             // We added in a small epsilon to prevent the color channel from being deleted with zero alpha.
   1712             // Because we only add it for integer types, it will automatically be discarded on integer
   1713             // conversion, so we don't need to subtract it back out (which would be problematic for
   1714             // numeric precision reasons).
   1715         }
   1716     }
   1717 
   1718     // build a table of all channels that need colorspace correction, so
   1719     // we don't perform colorspace correction on channels that don't need it.
   1720     for (x = 0, num_nonalpha = 0; x < channels; ++x)
   1721     {
   1722         if (x != alpha_channel || (stbir_info->flags & STBIR_FLAG_ALPHA_USES_COLORSPACE))
   1723         {
   1724             nonalpha[num_nonalpha++] = (stbir_uint16)x;
   1725         }
   1726     }
   1727 
   1728     #define STBIR__ROUND_INT(f)    ((int)          ((f)+0.5))
   1729     #define STBIR__ROUND_UINT(f)   ((stbir_uint32) ((f)+0.5))
   1730 
   1731     #ifdef STBIR__SATURATE_INT
   1732     #define STBIR__ENCODE_LINEAR8(f)   stbir__saturate8 (STBIR__ROUND_INT((f) * stbir__max_uint8_as_float ))
   1733     #define STBIR__ENCODE_LINEAR16(f)  stbir__saturate16(STBIR__ROUND_INT((f) * stbir__max_uint16_as_float))
   1734     #else
   1735     #define STBIR__ENCODE_LINEAR8(f)   (unsigned char ) STBIR__ROUND_INT(stbir__saturate(f) * stbir__max_uint8_as_float )
   1736     #define STBIR__ENCODE_LINEAR16(f)  (unsigned short) STBIR__ROUND_INT(stbir__saturate(f) * stbir__max_uint16_as_float)
   1737     #endif
   1738 
   1739     switch (decode)
   1740     {
   1741         case STBIR__DECODE(STBIR_TYPE_UINT8, STBIR_COLORSPACE_LINEAR):
   1742             for (x=0; x < num_pixels; ++x)
   1743             {
   1744                 int pixel_index = x*channels;
   1745 
   1746                 for (n = 0; n < channels; n++)
   1747                 {
   1748                     int index = pixel_index + n;
   1749                     ((unsigned char*)output_buffer)[index] = STBIR__ENCODE_LINEAR8(encode_buffer[index]);
   1750                 }
   1751             }
   1752             break;
   1753 
   1754         case STBIR__DECODE(STBIR_TYPE_UINT8, STBIR_COLORSPACE_SRGB):
   1755             for (x=0; x < num_pixels; ++x)
   1756             {
   1757                 int pixel_index = x*channels;
   1758 
   1759                 for (n = 0; n < num_nonalpha; n++)
   1760                 {
   1761                     int index = pixel_index + nonalpha[n];
   1762                     ((unsigned char*)output_buffer)[index] = stbir__linear_to_srgb_uchar(encode_buffer[index]);
   1763                 }
   1764 
   1765                 if (!(stbir_info->flags & STBIR_FLAG_ALPHA_USES_COLORSPACE))
   1766                     ((unsigned char *)output_buffer)[pixel_index + alpha_channel] = STBIR__ENCODE_LINEAR8(encode_buffer[pixel_index+alpha_channel]);
   1767             }
   1768             break;
   1769 
   1770         case STBIR__DECODE(STBIR_TYPE_UINT16, STBIR_COLORSPACE_LINEAR):
   1771             for (x=0; x < num_pixels; ++x)
   1772             {
   1773                 int pixel_index = x*channels;
   1774 
   1775                 for (n = 0; n < channels; n++)
   1776                 {
   1777                     int index = pixel_index + n;
   1778                     ((unsigned short*)output_buffer)[index] = STBIR__ENCODE_LINEAR16(encode_buffer[index]);
   1779                 }
   1780             }
   1781             break;
   1782 
   1783         case STBIR__DECODE(STBIR_TYPE_UINT16, STBIR_COLORSPACE_SRGB):
   1784             for (x=0; x < num_pixels; ++x)
   1785             {
   1786                 int pixel_index = x*channels;
   1787 
   1788                 for (n = 0; n < num_nonalpha; n++)
   1789                 {
   1790                     int index = pixel_index + nonalpha[n];
   1791                     ((unsigned short*)output_buffer)[index] = (unsigned short)STBIR__ROUND_INT(stbir__linear_to_srgb(stbir__saturate(encode_buffer[index])) * stbir__max_uint16_as_float);
   1792                 }
   1793 
   1794                 if (!(stbir_info->flags&STBIR_FLAG_ALPHA_USES_COLORSPACE))
   1795                     ((unsigned short*)output_buffer)[pixel_index + alpha_channel] = STBIR__ENCODE_LINEAR16(encode_buffer[pixel_index + alpha_channel]);
   1796             }
   1797 
   1798             break;
   1799 
   1800         case STBIR__DECODE(STBIR_TYPE_UINT32, STBIR_COLORSPACE_LINEAR):
   1801             for (x=0; x < num_pixels; ++x)
   1802             {
   1803                 int pixel_index = x*channels;
   1804 
   1805                 for (n = 0; n < channels; n++)
   1806                 {
   1807                     int index = pixel_index + n;
   1808                     ((unsigned int*)output_buffer)[index] = (unsigned int)STBIR__ROUND_UINT(((double)stbir__saturate(encode_buffer[index])) * stbir__max_uint32_as_float);
   1809                 }
   1810             }
   1811             break;
   1812 
   1813         case STBIR__DECODE(STBIR_TYPE_UINT32, STBIR_COLORSPACE_SRGB):
   1814             for (x=0; x < num_pixels; ++x)
   1815             {
   1816                 int pixel_index = x*channels;
   1817 
   1818                 for (n = 0; n < num_nonalpha; n++)
   1819                 {
   1820                     int index = pixel_index + nonalpha[n];
   1821                     ((unsigned int*)output_buffer)[index] = (unsigned int)STBIR__ROUND_UINT(((double)stbir__linear_to_srgb(stbir__saturate(encode_buffer[index]))) * stbir__max_uint32_as_float);
   1822                 }
   1823 
   1824                 if (!(stbir_info->flags&STBIR_FLAG_ALPHA_USES_COLORSPACE))
   1825                     ((unsigned int*)output_buffer)[pixel_index + alpha_channel] = (unsigned int)STBIR__ROUND_UINT(((double)stbir__saturate(encode_buffer[pixel_index + alpha_channel])) * stbir__max_uint32_as_float);
   1826             }
   1827             break;
   1828 
   1829         case STBIR__DECODE(STBIR_TYPE_FLOAT, STBIR_COLORSPACE_LINEAR):
   1830             for (x=0; x < num_pixels; ++x)
   1831             {
   1832                 int pixel_index = x*channels;
   1833 
   1834                 for (n = 0; n < channels; n++)
   1835                 {
   1836                     int index = pixel_index + n;
   1837                     ((float*)output_buffer)[index] = encode_buffer[index];
   1838                 }
   1839             }
   1840             break;
   1841 
   1842         case STBIR__DECODE(STBIR_TYPE_FLOAT, STBIR_COLORSPACE_SRGB):
   1843             for (x=0; x < num_pixels; ++x)
   1844             {
   1845                 int pixel_index = x*channels;
   1846 
   1847                 for (n = 0; n < num_nonalpha; n++)
   1848                 {
   1849                     int index = pixel_index + nonalpha[n];
   1850                     ((float*)output_buffer)[index] = stbir__linear_to_srgb(encode_buffer[index]);
   1851                 }
   1852 
   1853                 if (!(stbir_info->flags&STBIR_FLAG_ALPHA_USES_COLORSPACE))
   1854                     ((float*)output_buffer)[pixel_index + alpha_channel] = encode_buffer[pixel_index + alpha_channel];
   1855             }
   1856             break;
   1857 
   1858         default:
   1859             STBIR_ASSERT(!"Unknown type/colorspace/channels combination.");
   1860             break;
   1861     }
   1862 }
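
The integer encode paths all follow the same pattern: saturate, scale, then round by adding 0.5 and truncating. The sketch below isolates that pattern for 8-bit and 32-bit outputs. The function names are hypothetical, the constants are written out literally rather than taken from the library, and `unsigned int` is assumed to be at least 32 bits wide.

```c
#include <assert.h>

/* Clamp to [0,1] before scaling so filter overshoot cannot wrap around. */
static float example_saturate(float f)
{
    if (f < 0.0f) return 0.0f;
    if (f > 1.0f) return 1.0f;
    return f;
}

/* Float -> 8-bit: saturate, scale, round to nearest via +0.5 and truncation. */
static unsigned char example_encode_linear8(float f)
{
    return (unsigned char)(int)(example_saturate(f) * 255.0f + 0.5f);
}

/* Float -> 32-bit: scale in double, because a float cannot represent
   4294967295 exactly, and convert straight to an unsigned type, because
   the rounded value can exceed INT_MAX. */
static unsigned int example_encode_linear32(float f)
{
    assert(sizeof(unsigned int) >= 4);
    return (unsigned int)((double)example_saturate(f) * 4294967295.0 + 0.5);
}
```

An input of 0.5 encodes to 128 in 8 bits and to 2147483648 in 32 bits. The latter is already one past `INT_MAX` on platforms with a 32-bit `int`, which is why the 32-bit path needs an unsigned rounding step.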
   1863 
   1864 static void stbir__resample_vertical_upsample(stbir__info* stbir_info, int n)
   1865 {
   1866     int x, k;
   1867     int output_w = stbir_info->output_w;
   1868     stbir__contributors* vertical_contributors = stbir_info->vertical_contributors;
   1869     float* vertical_coefficients = stbir_info->vertical_coefficients;
   1870     int channels = stbir_info->channels;
   1871     int alpha_channel = stbir_info->alpha_channel;
   1872     int type = stbir_info->type;
   1873     int colorspace = stbir_info->colorspace;
   1874     int ring_buffer_entries = stbir_info->ring_buffer_num_entries;
   1875     void* output_data = stbir_info->output_data;
   1876     float* encode_buffer = stbir_info->encode_buffer;
   1877     int decode = STBIR__DECODE(type, colorspace);
   1878     int coefficient_width = stbir_info->vertical_coefficient_width;
   1879     int coefficient_counter;
   1880     int contributor = n;
   1881 
   1882     float* ring_buffer = stbir_info->ring_buffer;
   1883     int ring_buffer_begin_index = stbir_info->ring_buffer_begin_index;
   1884     int ring_buffer_first_scanline = stbir_info->ring_buffer_first_scanline;
   1885     int ring_buffer_length = stbir_info->ring_buffer_length_bytes/sizeof(float);
   1886 
   1887     int n0,n1, output_row_start;
   1888     int coefficient_group = coefficient_width * contributor;
   1889 
   1890     n0 = vertical_contributors[contributor].n0;
   1891     n1 = vertical_contributors[contributor].n1;
   1892 
   1893     output_row_start = n * stbir_info->output_stride_bytes;
   1894 
   1895     STBIR_ASSERT(stbir__use_height_upsampling(stbir_info));
   1896 
   1897     memset(encode_buffer, 0, output_w * sizeof(float) * channels);
   1898 
   1899     // I tried reblocking this for better cache usage of encode_buffer
   1900     // (using x_outer, k, x_inner), but it lost speed. -- stb
   1901 
   1902     coefficient_counter = 0;
   1903     switch (channels) {
   1904         case 1:
   1905             for (k = n0; k <= n1; k++)
   1906             {
   1907                 int coefficient_index = coefficient_counter++;
   1908                 float* ring_buffer_entry = stbir__get_ring_buffer_scanline(k, ring_buffer, ring_buffer_begin_index, ring_buffer_first_scanline, ring_buffer_entries, ring_buffer_length);
   1909                 float coefficient = vertical_coefficients[coefficient_group + coefficient_index];
   1910                 for (x = 0; x < output_w; ++x)
   1911                 {
   1912                     int in_pixel_index = x * 1;
   1913                     encode_buffer[in_pixel_index + 0] += ring_buffer_entry[in_pixel_index + 0] * coefficient;
   1914                 }
   1915             }
   1916             break;
   1917         case 2:
   1918             for (k = n0; k <= n1; k++)
   1919             {
   1920                 int coefficient_index = coefficient_counter++;
   1921                 float* ring_buffer_entry = stbir__get_ring_buffer_scanline(k, ring_buffer, ring_buffer_begin_index, ring_buffer_first_scanline, ring_buffer_entries, ring_buffer_length);
   1922                 float coefficient = vertical_coefficients[coefficient_group + coefficient_index];
   1923                 for (x = 0; x < output_w; ++x)
   1924                 {
   1925                     int in_pixel_index = x * 2;
   1926                     encode_buffer[in_pixel_index + 0] += ring_buffer_entry[in_pixel_index + 0] * coefficient;
   1927                     encode_buffer[in_pixel_index + 1] += ring_buffer_entry[in_pixel_index + 1] * coefficient;
   1928                 }
   1929             }
   1930             break;
   1931         case 3:
   1932             for (k = n0; k <= n1; k++)
   1933             {
   1934                 int coefficient_index = coefficient_counter++;
   1935                 float* ring_buffer_entry = stbir__get_ring_buffer_scanline(k, ring_buffer, ring_buffer_begin_index, ring_buffer_first_scanline, ring_buffer_entries, ring_buffer_length);
   1936                 float coefficient = vertical_coefficients[coefficient_group + coefficient_index];
   1937                 for (x = 0; x < output_w; ++x)
   1938                 {
   1939                     int in_pixel_index = x * 3;
   1940                     encode_buffer[in_pixel_index + 0] += ring_buffer_entry[in_pixel_index + 0] * coefficient;
   1941                     encode_buffer[in_pixel_index + 1] += ring_buffer_entry[in_pixel_index + 1] * coefficient;
   1942                     encode_buffer[in_pixel_index + 2] += ring_buffer_entry[in_pixel_index + 2] * coefficient;
   1943                 }
   1944             }
   1945             break;
   1946         case 4:
   1947             for (k = n0; k <= n1; k++)
   1948             {
   1949                 int coefficient_index = coefficient_counter++;
   1950                 float* ring_buffer_entry = stbir__get_ring_buffer_scanline(k, ring_buffer, ring_buffer_begin_index, ring_buffer_first_scanline, ring_buffer_entries, ring_buffer_length);
   1951                 float coefficient = vertical_coefficients[coefficient_group + coefficient_index];
   1952                 for (x = 0; x < output_w; ++x)
   1953                 {
   1954                     int in_pixel_index = x * 4;
   1955                     encode_buffer[in_pixel_index + 0] += ring_buffer_entry[in_pixel_index + 0] * coefficient;
   1956                     encode_buffer[in_pixel_index + 1] += ring_buffer_entry[in_pixel_index + 1] * coefficient;
   1957                     encode_buffer[in_pixel_index + 2] += ring_buffer_entry[in_pixel_index + 2] * coefficient;
   1958                     encode_buffer[in_pixel_index + 3] += ring_buffer_entry[in_pixel_index + 3] * coefficient;
   1959                 }
   1960             }
   1961             break;
   1962         default:
   1963             for (k = n0; k <= n1; k++)
   1964             {
   1965                 int coefficient_index = coefficient_counter++;
   1966                 float* ring_buffer_entry = stbir__get_ring_buffer_scanline(k, ring_buffer, ring_buffer_begin_index, ring_buffer_first_scanline, ring_buffer_entries, ring_buffer_length);
   1967                 float coefficient = vertical_coefficients[coefficient_group + coefficient_index];
   1968                 for (x = 0; x < output_w; ++x)
   1969                 {
   1970                     int in_pixel_index = x * channels;
   1971                     int c;
   1972                     for (c = 0; c < channels; c++)
   1973                         encode_buffer[in_pixel_index + c] += ring_buffer_entry[in_pixel_index + c] * coefficient;
   1974                 }
   1975             }
   1976             break;
   1977     }
   1978     stbir__encode_scanline(stbir_info, output_w, (char *) output_data + output_row_start, encode_buffer, channels, alpha_channel, decode);
   1979 }
   1980 
   1981 static void stbir__resample_vertical_downsample(stbir__info* stbir_info, int n)
   1982 {
   1983     int x, k;
   1984     int output_w = stbir_info->output_w;
   1985     stbir__contributors* vertical_contributors = stbir_info->vertical_contributors;
   1986     float* vertical_coefficients = stbir_info->vertical_coefficients;
   1987     int channels = stbir_info->channels;
   1988     int ring_buffer_entries = stbir_info->ring_buffer_num_entries;
   1989     float* horizontal_buffer = stbir_info->horizontal_buffer;
   1990     int coefficient_width = stbir_info->vertical_coefficient_width;
   1991     int contributor = n + stbir_info->vertical_filter_pixel_margin;
   1992 
   1993     float* ring_buffer = stbir_info->ring_buffer;
   1994     int ring_buffer_begin_index = stbir_info->ring_buffer_begin_index;
   1995     int ring_buffer_first_scanline = stbir_info->ring_buffer_first_scanline;
   1996     int ring_buffer_length = stbir_info->ring_buffer_length_bytes/sizeof(float);
   1997     int n0,n1;
   1998 
   1999     n0 = vertical_contributors[contributor].n0;
   2000     n1 = vertical_contributors[contributor].n1;
   2001 
   2002     STBIR_ASSERT(!stbir__use_height_upsampling(stbir_info));
   2003 
   2004     for (k = n0; k <= n1; k++)
   2005     {
   2006         int coefficient_index = k - n0;
   2007         int coefficient_group = coefficient_width * contributor;
   2008         float coefficient = vertical_coefficients[coefficient_group + coefficient_index];
   2009 
   2010         float* ring_buffer_entry = stbir__get_ring_buffer_scanline(k, ring_buffer, ring_buffer_begin_index, ring_buffer_first_scanline, ring_buffer_entries, ring_buffer_length);
   2011 
   2012         switch (channels) {
   2013             case 1:
   2014                 for (x = 0; x < output_w; x++)
   2015                 {
   2016                     int in_pixel_index = x * 1;
   2017                     ring_buffer_entry[in_pixel_index + 0] += horizontal_buffer[in_pixel_index + 0] * coefficient;
   2018                 }
   2019                 break;
   2020             case 2:
   2021                 for (x = 0; x < output_w; x++)
   2022                 {
   2023                     int in_pixel_index = x * 2;
   2024                     ring_buffer_entry[in_pixel_index + 0] += horizontal_buffer[in_pixel_index + 0] * coefficient;
   2025                     ring_buffer_entry[in_pixel_index + 1] += horizontal_buffer[in_pixel_index + 1] * coefficient;
   2026                 }
   2027                 break;
   2028             case 3:
   2029                 for (x = 0; x < output_w; x++)
   2030                 {
   2031                     int in_pixel_index = x * 3;
   2032                     ring_buffer_entry[in_pixel_index + 0] += horizontal_buffer[in_pixel_index + 0] * coefficient;
   2033                     ring_buffer_entry[in_pixel_index + 1] += horizontal_buffer[in_pixel_index + 1] * coefficient;
   2034                     ring_buffer_entry[in_pixel_index + 2] += horizontal_buffer[in_pixel_index + 2] * coefficient;
   2035                 }
   2036                 break;
   2037             case 4:
   2038                 for (x = 0; x < output_w; x++)
   2039                 {
   2040                     int in_pixel_index = x * 4;
   2041                     ring_buffer_entry[in_pixel_index + 0] += horizontal_buffer[in_pixel_index + 0] * coefficient;
   2042                     ring_buffer_entry[in_pixel_index + 1] += horizontal_buffer[in_pixel_index + 1] * coefficient;
   2043                     ring_buffer_entry[in_pixel_index + 2] += horizontal_buffer[in_pixel_index + 2] * coefficient;
   2044                     ring_buffer_entry[in_pixel_index + 3] += horizontal_buffer[in_pixel_index + 3] * coefficient;
   2045                 }
   2046                 break;
   2047             default:
   2048                 for (x = 0; x < output_w; x++)
   2049                 {
   2050                     int in_pixel_index = x * channels;
   2051 
   2052                     int c;
   2053                     for (c = 0; c < channels; c++)
   2054                         ring_buffer_entry[in_pixel_index + c] += horizontal_buffer[in_pixel_index + c] * coefficient;
   2055                 }
   2056                 break;
   2057         }
   2058     }
   2059 }
   2060 
   2061 static void stbir__buffer_loop_upsample(stbir__info* stbir_info)
   2062 {
   2063     int y;
   2064     float scale_ratio = stbir_info->vertical_scale;
   2065     float out_scanlines_radius = stbir__filter_info_table[stbir_info->vertical_filter].support(1/scale_ratio) * scale_ratio;
   2066 
   2067     STBIR_ASSERT(stbir__use_height_upsampling(stbir_info));
   2068 
   2069     for (y = 0; y < stbir_info->output_h; y++)
   2070     {
   2071         float in_center_of_out = 0; // Center of the current out scanline in the in scanline space
   2072         int in_first_scanline = 0, in_last_scanline = 0;
   2073 
   2074         stbir__calculate_sample_range_upsample(y, out_scanlines_radius, scale_ratio, stbir_info->vertical_shift, &in_first_scanline, &in_last_scanline, &in_center_of_out);
   2075 
   2076         STBIR_ASSERT(in_last_scanline - in_first_scanline + 1 <= stbir_info->ring_buffer_num_entries);
   2077 
   2078         if (stbir_info->ring_buffer_begin_index >= 0)
   2079         {
   2080             // Get rid of whatever we don't need anymore.
   2081             while (in_first_scanline > stbir_info->ring_buffer_first_scanline)
   2082             {
   2083                 if (stbir_info->ring_buffer_first_scanline == stbir_info->ring_buffer_last_scanline)
   2084                 {
   2085                     // We just popped the last scanline off the ring buffer.
   2086                     // Reset it to the empty state.
   2087                     stbir_info->ring_buffer_begin_index = -1;
   2088                     stbir_info->ring_buffer_first_scanline = 0;
   2089                     stbir_info->ring_buffer_last_scanline = 0;
   2090                     break;
   2091                 }
   2092                 else
   2093                 {
   2094                     stbir_info->ring_buffer_first_scanline++;
   2095                     stbir_info->ring_buffer_begin_index = (stbir_info->ring_buffer_begin_index + 1) % stbir_info->ring_buffer_num_entries;
   2096                 }
   2097             }
   2098         }
   2099 
   2100         // Decode and horizontally resample any input scanlines not yet in the ring buffer.
   2101         if (stbir_info->ring_buffer_begin_index < 0)
   2102             stbir__decode_and_resample_upsample(stbir_info, in_first_scanline);
   2103 
   2104         while (in_last_scanline > stbir_info->ring_buffer_last_scanline)
   2105             stbir__decode_and_resample_upsample(stbir_info, stbir_info->ring_buffer_last_scanline + 1);
   2106 
   2107         // Now all buffers should be ready to write a row of vertical sampling.
   2108         stbir__resample_vertical_upsample(stbir_info, y);
   2109 
   2110         STBIR_PROGRESS_REPORT((float)y / stbir_info->output_h);
   2111     }
   2112 }
   2113 
   2114 static void stbir__empty_ring_buffer(stbir__info* stbir_info, int first_necessary_scanline)
   2115 {
   2116     int output_stride_bytes = stbir_info->output_stride_bytes;
   2117     int channels = stbir_info->channels;
   2118     int alpha_channel = stbir_info->alpha_channel;
   2119     int type = stbir_info->type;
   2120     int colorspace = stbir_info->colorspace;
   2121     int output_w = stbir_info->output_w;
   2122     void* output_data = stbir_info->output_data;
   2123     int decode = STBIR__DECODE(type, colorspace);
   2124 
   2125     float* ring_buffer = stbir_info->ring_buffer;
   2126     int ring_buffer_length = stbir_info->ring_buffer_length_bytes/sizeof(float);
   2127 
   2128     if (stbir_info->ring_buffer_begin_index >= 0)
   2129     {
   2130         // Get rid of whatever we don't need anymore.
   2131         while (first_necessary_scanline > stbir_info->ring_buffer_first_scanline)
   2132         {
   2133             if (stbir_info->ring_buffer_first_scanline >= 0 && stbir_info->ring_buffer_first_scanline < stbir_info->output_h)
   2134             {
   2135                 int output_row_start = stbir_info->ring_buffer_first_scanline * output_stride_bytes;
   2136                 float* ring_buffer_entry = stbir__get_ring_buffer_entry(ring_buffer, stbir_info->ring_buffer_begin_index, ring_buffer_length);
   2137                 stbir__encode_scanline(stbir_info, output_w, (char *) output_data + output_row_start, ring_buffer_entry, channels, alpha_channel, decode);
   2138                 STBIR_PROGRESS_REPORT((float)stbir_info->ring_buffer_first_scanline / stbir_info->output_h);
   2139             }
   2140 
   2141             if (stbir_info->ring_buffer_first_scanline == stbir_info->ring_buffer_last_scanline)
   2142             {
   2143                 // We just popped the last scanline off the ring buffer.
   2144                 // Reset it to the empty state.
   2145                 stbir_info->ring_buffer_begin_index = -1;
   2146                 stbir_info->ring_buffer_first_scanline = 0;
   2147                 stbir_info->ring_buffer_last_scanline = 0;
   2148                 break;
   2149             }
   2150             else
   2151             {
   2152                 stbir_info->ring_buffer_first_scanline++;
   2153                 stbir_info->ring_buffer_begin_index = (stbir_info->ring_buffer_begin_index + 1) % stbir_info->ring_buffer_num_entries;
   2154             }
   2155         }
   2156     }
   2157 }
   2158 
   2159 static void stbir__buffer_loop_downsample(stbir__info* stbir_info)
   2160 {
   2161     int y;
   2162     float scale_ratio = stbir_info->vertical_scale;
   2163     int output_h = stbir_info->output_h;
   2164     float in_pixels_radius = stbir__filter_info_table[stbir_info->vertical_filter].support(scale_ratio) / scale_ratio;
   2165     int pixel_margin = stbir_info->vertical_filter_pixel_margin;
   2166     int max_y = stbir_info->input_h + pixel_margin;
   2167 
   2168     STBIR_ASSERT(!stbir__use_height_upsampling(stbir_info));
   2169 
   2170     for (y = -pixel_margin; y < max_y; y++)
   2171     {
   2172         float out_center_of_in; // Center of the current in scanline in the out scanline space
   2173         int out_first_scanline, out_last_scanline;
   2174 
   2175         stbir__calculate_sample_range_downsample(y, in_pixels_radius, scale_ratio, stbir_info->vertical_shift, &out_first_scanline, &out_last_scanline, &out_center_of_in);
   2176 
   2177         STBIR_ASSERT(out_last_scanline - out_first_scanline + 1 <= stbir_info->ring_buffer_num_entries);
   2178 
   2179         if (out_last_scanline < 0 || out_first_scanline >= output_h)
   2180             continue;
   2181 
   2182         stbir__empty_ring_buffer(stbir_info, out_first_scanline);
   2183 
   2184         stbir__decode_and_resample_downsample(stbir_info, y);
   2185 
   2186         // Add zeroed ring buffer entries for any output scanlines this input row contributes to.
   2187         if (stbir_info->ring_buffer_begin_index < 0)
   2188             stbir__add_empty_ring_buffer_entry(stbir_info, out_first_scanline);
   2189 
   2190         while (out_last_scanline > stbir_info->ring_buffer_last_scanline)
   2191             stbir__add_empty_ring_buffer_entry(stbir_info, stbir_info->ring_buffer_last_scanline + 1);
   2192 
   2193         // Now the horizontal buffer is ready to write to all ring buffer rows.
   2194         stbir__resample_vertical_downsample(stbir_info, y);
   2195     }
   2196 
   2197     stbir__empty_ring_buffer(stbir_info, stbir_info->output_h);
   2198 }
   2199 
   2200 static void stbir__setup(stbir__info *info, int input_w, int input_h, int output_w, int output_h, int channels)
   2201 {
   2202     info->input_w = input_w;
   2203     info->input_h = input_h;
   2204     info->output_w = output_w;
   2205     info->output_h = output_h;
   2206     info->channels = channels;
   2207 }
   2208 
   2209 static void stbir__calculate_transform(stbir__info *info, float s0, float t0, float s1, float t1, float *transform)
   2210 {
   2211     info->s0 = s0;
   2212     info->t0 = t0;
   2213     info->s1 = s1;
   2214     info->t1 = t1;
   2215 
   2216     if (transform)
   2217     {
   2218         info->horizontal_scale = transform[0];
   2219         info->vertical_scale   = transform[1];
   2220         info->horizontal_shift = transform[2];
   2221         info->vertical_shift   = transform[3];
   2222     }
   2223     else
   2224     {
   2225         info->horizontal_scale = ((float)info->output_w / info->input_w) / (s1 - s0);
   2226         info->vertical_scale = ((float)info->output_h / info->input_h) / (t1 - t0);
   2227 
   2228         info->horizontal_shift = s0 * info->output_w / (s1 - s0);
   2229         info->vertical_shift = t0 * info->output_h / (t1 - t0);
   2230     }
   2231 }
   2232 
   2233 static void stbir__choose_filter(stbir__info *info, stbir_filter h_filter, stbir_filter v_filter)
   2234 {
   2235     if (h_filter == 0)
   2236         h_filter = stbir__use_upsampling(info->horizontal_scale) ? STBIR_DEFAULT_FILTER_UPSAMPLE : STBIR_DEFAULT_FILTER_DOWNSAMPLE;
   2237     if (v_filter == 0)
   2238         v_filter = stbir__use_upsampling(info->vertical_scale)   ? STBIR_DEFAULT_FILTER_UPSAMPLE : STBIR_DEFAULT_FILTER_DOWNSAMPLE;
   2239     info->horizontal_filter = h_filter;
   2240     info->vertical_filter = v_filter;
   2241 }
   2242 
   2243 static stbir_uint32 stbir__calculate_memory(stbir__info *info)
   2244 {
   2245     int pixel_margin = stbir__get_filter_pixel_margin(info->horizontal_filter, info->horizontal_scale);
   2246     int filter_height = stbir__get_filter_pixel_width(info->vertical_filter, info->vertical_scale);
   2247 
   2248     info->horizontal_num_contributors = stbir__get_contributors(info->horizontal_scale, info->horizontal_filter, info->input_w, info->output_w);
   2249     info->vertical_num_contributors   = stbir__get_contributors(info->vertical_scale  , info->vertical_filter  , info->input_h, info->output_h);
   2250 
   2251     // One extra entry, because floating-point rounding sometimes makes one more scanline necessary.
   2252     info->ring_buffer_num_entries = filter_height + 1;
   2253 
   2254     info->horizontal_contributors_size = info->horizontal_num_contributors * sizeof(stbir__contributors);
   2255     info->horizontal_coefficients_size = stbir__get_total_horizontal_coefficients(info) * sizeof(float);
   2256     info->vertical_contributors_size = info->vertical_num_contributors * sizeof(stbir__contributors);
   2257     info->vertical_coefficients_size = stbir__get_total_vertical_coefficients(info) * sizeof(float);
   2258     info->decode_buffer_size = (info->input_w + pixel_margin * 2) * info->channels * sizeof(float);
   2259     info->horizontal_buffer_size = info->output_w * info->channels * sizeof(float);
   2260     info->ring_buffer_size = info->output_w * info->channels * info->ring_buffer_num_entries * sizeof(float);
   2261     info->encode_buffer_size = info->output_w * info->channels * sizeof(float);
   2262 
   2263     STBIR_ASSERT(info->horizontal_filter != 0);
   2264     STBIR_ASSERT(info->horizontal_filter < STBIR__ARRAY_SIZE(stbir__filter_info_table)); // checked too late: the filter was already used by the size calculations above
   2265     STBIR_ASSERT(info->vertical_filter != 0);
   2266     STBIR_ASSERT(info->vertical_filter < STBIR__ARRAY_SIZE(stbir__filter_info_table)); // checked too late: the filter was already used by the size calculations above
   2267 
   2268     if (stbir__use_height_upsampling(info))
   2269         // The horizontal buffer is for when we're downsampling the height and we
   2270         // can't output the result of sampling the decode buffer directly into the
   2271         // ring buffers.
   2272         info->horizontal_buffer_size = 0;
   2273     else
   2274         // The encode buffer is to retain precision in the height upsampling method
   2275         // and isn't used when height downsampling.
   2276         info->encode_buffer_size = 0;
   2277 
   2278     return info->horizontal_contributors_size + info->horizontal_coefficients_size
   2279         + info->vertical_contributors_size + info->vertical_coefficients_size
   2280         + info->decode_buffer_size + info->horizontal_buffer_size
   2281         + info->ring_buffer_size + info->encode_buffer_size;
   2282 }
   2283 
   2284 static int stbir__resize_allocated(stbir__info *info,
   2285     const void* input_data, int input_stride_in_bytes,
   2286     void* output_data, int output_stride_in_bytes,
   2287     int alpha_channel, stbir_uint32 flags, stbir_datatype type,
   2288     stbir_edge edge_horizontal, stbir_edge edge_vertical, stbir_colorspace colorspace,
   2289     void* tempmem, size_t tempmem_size_in_bytes)
   2290 {
   2291     size_t memory_required = stbir__calculate_memory(info);
   2292 
   2293     int width_stride_input = input_stride_in_bytes ? input_stride_in_bytes : info->channels * info->input_w * stbir__type_size[type];
   2294     int width_stride_output = output_stride_in_bytes ? output_stride_in_bytes : info->channels * info->output_w * stbir__type_size[type];
   2295 
   2296 #ifdef STBIR_DEBUG_OVERWRITE_TEST
   2297 #define OVERWRITE_ARRAY_SIZE 8
   2298     unsigned char overwrite_output_before_pre[OVERWRITE_ARRAY_SIZE];
   2299     unsigned char overwrite_tempmem_before_pre[OVERWRITE_ARRAY_SIZE];
   2300     unsigned char overwrite_output_after_pre[OVERWRITE_ARRAY_SIZE];
   2301     unsigned char overwrite_tempmem_after_pre[OVERWRITE_ARRAY_SIZE];
   2302 
   2303     size_t begin_forbidden = width_stride_output * (info->output_h - 1) + info->output_w * info->channels * stbir__type_size[type];
   2304     memcpy(overwrite_output_before_pre, &((unsigned char*)output_data)[-OVERWRITE_ARRAY_SIZE], OVERWRITE_ARRAY_SIZE);
   2305     memcpy(overwrite_output_after_pre, &((unsigned char*)output_data)[begin_forbidden], OVERWRITE_ARRAY_SIZE);
   2306     memcpy(overwrite_tempmem_before_pre, &((unsigned char*)tempmem)[-OVERWRITE_ARRAY_SIZE], OVERWRITE_ARRAY_SIZE);
   2307     memcpy(overwrite_tempmem_after_pre, &((unsigned char*)tempmem)[tempmem_size_in_bytes], OVERWRITE_ARRAY_SIZE);
   2308 #endif
   2309 
   2310     STBIR_ASSERT(info->channels >= 0);
   2311     STBIR_ASSERT(info->channels <= STBIR_MAX_CHANNELS);
   2312 
   2313     if (info->channels < 0 || info->channels > STBIR_MAX_CHANNELS)
   2314         return 0;
   2315 
   2316     STBIR_ASSERT(info->horizontal_filter < STBIR__ARRAY_SIZE(stbir__filter_info_table));
   2317     STBIR_ASSERT(info->vertical_filter < STBIR__ARRAY_SIZE(stbir__filter_info_table));
   2318 
   2319     if (info->horizontal_filter >= STBIR__ARRAY_SIZE(stbir__filter_info_table))
   2320         return 0;
   2321     if (info->vertical_filter >= STBIR__ARRAY_SIZE(stbir__filter_info_table))
   2322         return 0;
   2323 
   2324     if (alpha_channel < 0)
   2325         flags |= STBIR_FLAG_ALPHA_USES_COLORSPACE | STBIR_FLAG_ALPHA_PREMULTIPLIED;
   2326 
   2327     if (!(flags&STBIR_FLAG_ALPHA_USES_COLORSPACE) || !(flags&STBIR_FLAG_ALPHA_PREMULTIPLIED))
   2328         STBIR_ASSERT(alpha_channel >= 0 && alpha_channel < info->channels);
   2329 
   2330     if (alpha_channel >= info->channels)
   2331         return 0;
   2332 
   2333     STBIR_ASSERT(tempmem);
   2334 
   2335     if (!tempmem)
   2336         return 0;
   2337 
   2338     STBIR_ASSERT(tempmem_size_in_bytes >= memory_required);
   2339 
   2340     if (tempmem_size_in_bytes < memory_required)
   2341         return 0;
   2342 
   2343     memset(tempmem, 0, tempmem_size_in_bytes);
   2344 
   2345     info->input_data = input_data;
   2346     info->input_stride_bytes = width_stride_input;
   2347 
   2348     info->output_data = output_data;
   2349     info->output_stride_bytes = width_stride_output;
   2350 
   2351     info->alpha_channel = alpha_channel;
   2352     info->flags = flags;
   2353     info->type = type;
   2354     info->edge_horizontal = edge_horizontal;
   2355     info->edge_vertical = edge_vertical;
   2356     info->colorspace = colorspace;
   2357 
   2358     info->horizontal_coefficient_width   = stbir__get_coefficient_width  (info->horizontal_filter, info->horizontal_scale);
   2359     info->vertical_coefficient_width     = stbir__get_coefficient_width  (info->vertical_filter  , info->vertical_scale  );
   2360     info->horizontal_filter_pixel_width  = stbir__get_filter_pixel_width (info->horizontal_filter, info->horizontal_scale);
   2361     info->vertical_filter_pixel_width    = stbir__get_filter_pixel_width (info->vertical_filter  , info->vertical_scale  );
   2362     info->horizontal_filter_pixel_margin = stbir__get_filter_pixel_margin(info->horizontal_filter, info->horizontal_scale);
   2363     info->vertical_filter_pixel_margin   = stbir__get_filter_pixel_margin(info->vertical_filter  , info->vertical_scale  );
   2364 
   2365     info->ring_buffer_length_bytes = info->output_w * info->channels * sizeof(float);
   2366     info->decode_buffer_pixels = info->input_w + info->horizontal_filter_pixel_margin * 2;
   2367 
   2368 #define STBIR__NEXT_MEMPTR(current, newtype) (newtype*)(((unsigned char*)current) + current##_size)
   2369 
   2370     info->horizontal_contributors = (stbir__contributors *) tempmem;
   2371     info->horizontal_coefficients = STBIR__NEXT_MEMPTR(info->horizontal_contributors, float);
   2372     info->vertical_contributors = STBIR__NEXT_MEMPTR(info->horizontal_coefficients, stbir__contributors);
   2373     info->vertical_coefficients = STBIR__NEXT_MEMPTR(info->vertical_contributors, float);
   2374     info->decode_buffer = STBIR__NEXT_MEMPTR(info->vertical_coefficients, float);
   2375 
   2376     if (stbir__use_height_upsampling(info))
   2377     {
   2378         info->horizontal_buffer = NULL;
   2379         info->ring_buffer = STBIR__NEXT_MEMPTR(info->decode_buffer, float);
   2380         info->encode_buffer = STBIR__NEXT_MEMPTR(info->ring_buffer, float);
   2381 
   2382         STBIR_ASSERT((size_t)STBIR__NEXT_MEMPTR(info->encode_buffer, unsigned char) == (size_t)tempmem + tempmem_size_in_bytes);
   2383     }
   2384     else
   2385     {
   2386         info->horizontal_buffer = STBIR__NEXT_MEMPTR(info->decode_buffer, float);
   2387         info->ring_buffer = STBIR__NEXT_MEMPTR(info->horizontal_buffer, float);
   2388         info->encode_buffer = NULL;
   2389 
   2390         STBIR_ASSERT((size_t)STBIR__NEXT_MEMPTR(info->ring_buffer, unsigned char) == (size_t)tempmem + tempmem_size_in_bytes);
   2391     }
   2392 
   2393 #undef STBIR__NEXT_MEMPTR
   2394 
   2395     // This signals that the ring buffer is empty
   2396     info->ring_buffer_begin_index = -1;
   2397 
   2398     stbir__calculate_filters(info->horizontal_contributors, info->horizontal_coefficients, info->horizontal_filter, info->horizontal_scale, info->horizontal_shift, info->input_w, info->output_w);
   2399     stbir__calculate_filters(info->vertical_contributors, info->vertical_coefficients, info->vertical_filter, info->vertical_scale, info->vertical_shift, info->input_h, info->output_h);
   2400 
   2401     STBIR_PROGRESS_REPORT(0);
   2402 
   2403     if (stbir__use_height_upsampling(info))
   2404         stbir__buffer_loop_upsample(info);
   2405     else
   2406         stbir__buffer_loop_downsample(info);
   2407 
   2408     STBIR_PROGRESS_REPORT(1);
   2409 
   2410 #ifdef STBIR_DEBUG_OVERWRITE_TEST
   2411     STBIR_ASSERT(memcmp(overwrite_output_before_pre, &((unsigned char*)output_data)[-OVERWRITE_ARRAY_SIZE], OVERWRITE_ARRAY_SIZE) == 0);
   2412     STBIR_ASSERT(memcmp(overwrite_output_after_pre, &((unsigned char*)output_data)[begin_forbidden], OVERWRITE_ARRAY_SIZE) == 0);
   2413     STBIR_ASSERT(memcmp(overwrite_tempmem_before_pre, &((unsigned char*)tempmem)[-OVERWRITE_ARRAY_SIZE], OVERWRITE_ARRAY_SIZE) == 0);
   2414     STBIR_ASSERT(memcmp(overwrite_tempmem_after_pre, &((unsigned char*)tempmem)[tempmem_size_in_bytes], OVERWRITE_ARRAY_SIZE) == 0);
   2415 #endif
   2416 
   2417     return 1;
   2418 }
   2419 
   2420 
   2421 static int stbir__resize_arbitrary(
   2422     void *alloc_context,
   2423     const void* input_data, int input_w, int input_h, int input_stride_in_bytes,
   2424     void* output_data, int output_w, int output_h, int output_stride_in_bytes,
   2425     float s0, float t0, float s1, float t1, float *transform,
   2426     int channels, int alpha_channel, stbir_uint32 flags, stbir_datatype type,
   2427     stbir_filter h_filter, stbir_filter v_filter,
   2428     stbir_edge edge_horizontal, stbir_edge edge_vertical, stbir_colorspace colorspace)
   2429 {
   2430     stbir__info info;
   2431     int result;
   2432     size_t memory_required;
   2433     void* extra_memory;
   2434 
   2435     stbir__setup(&info, input_w, input_h, output_w, output_h, channels);
   2436     stbir__calculate_transform(&info, s0,t0,s1,t1,transform);
   2437     stbir__choose_filter(&info, h_filter, v_filter);
   2438     memory_required = stbir__calculate_memory(&info);
   2439     extra_memory = STBIR_MALLOC(memory_required, alloc_context);
   2440 
   2441     if (!extra_memory)
   2442         return 0;
   2443 
   2444     result = stbir__resize_allocated(&info, input_data, input_stride_in_bytes,
   2445                                             output_data, output_stride_in_bytes, 
   2446                                             alpha_channel, flags, type,
   2447                                             edge_horizontal, edge_vertical,
   2448                                             colorspace, extra_memory, memory_required);
   2449 
   2450     STBIR_FREE(extra_memory, alloc_context);
   2451 
   2452     return result;
   2453 }
   2454 
   2455 STBIRDEF int stbir_resize_uint8(     const unsigned char *input_pixels , int input_w , int input_h , int input_stride_in_bytes,
   2456                                            unsigned char *output_pixels, int output_w, int output_h, int output_stride_in_bytes,
   2457                                      int num_channels)
   2458 {
   2459     return stbir__resize_arbitrary(NULL, input_pixels, input_w, input_h, input_stride_in_bytes,
   2460         output_pixels, output_w, output_h, output_stride_in_bytes,
   2461         0,0,1,1,NULL,num_channels,-1,0, STBIR_TYPE_UINT8, STBIR_FILTER_DEFAULT, STBIR_FILTER_DEFAULT,
   2462         STBIR_EDGE_CLAMP, STBIR_EDGE_CLAMP, STBIR_COLORSPACE_LINEAR);
   2463 }
   2464 
   2465 STBIRDEF int stbir_resize_float(     const float *input_pixels , int input_w , int input_h , int input_stride_in_bytes,
   2466                                            float *output_pixels, int output_w, int output_h, int output_stride_in_bytes,
   2467                                      int num_channels)
   2468 {
   2469     return stbir__resize_arbitrary(NULL, input_pixels, input_w, input_h, input_stride_in_bytes,
   2470         output_pixels, output_w, output_h, output_stride_in_bytes,
   2471         0,0,1,1,NULL,num_channels,-1,0, STBIR_TYPE_FLOAT, STBIR_FILTER_DEFAULT, STBIR_FILTER_DEFAULT,
   2472         STBIR_EDGE_CLAMP, STBIR_EDGE_CLAMP, STBIR_COLORSPACE_LINEAR);
   2473 }
   2474 
   2475 STBIRDEF int stbir_resize_uint8_srgb(const unsigned char *input_pixels , int input_w , int input_h , int input_stride_in_bytes,
   2476                                            unsigned char *output_pixels, int output_w, int output_h, int output_stride_in_bytes,
   2477                                      int num_channels, int alpha_channel, int flags)
   2478 {
   2479     return stbir__resize_arbitrary(NULL, input_pixels, input_w, input_h, input_stride_in_bytes,
   2480         output_pixels, output_w, output_h, output_stride_in_bytes,
   2481         0,0,1,1,NULL,num_channels,alpha_channel,flags, STBIR_TYPE_UINT8, STBIR_FILTER_DEFAULT, STBIR_FILTER_DEFAULT,
   2482         STBIR_EDGE_CLAMP, STBIR_EDGE_CLAMP, STBIR_COLORSPACE_SRGB);
   2483 }
   2484 
   2485 STBIRDEF int stbir_resize_uint8_srgb_edgemode(const unsigned char *input_pixels , int input_w , int input_h , int input_stride_in_bytes,
   2486                                                     unsigned char *output_pixels, int output_w, int output_h, int output_stride_in_bytes,
   2487                                               int num_channels, int alpha_channel, int flags,
   2488                                               stbir_edge edge_wrap_mode)
   2489 {
   2490     return stbir__resize_arbitrary(NULL, input_pixels, input_w, input_h, input_stride_in_bytes,
   2491         output_pixels, output_w, output_h, output_stride_in_bytes,
   2492         0,0,1,1,NULL,num_channels,alpha_channel,flags, STBIR_TYPE_UINT8, STBIR_FILTER_DEFAULT, STBIR_FILTER_DEFAULT,
   2493         edge_wrap_mode, edge_wrap_mode, STBIR_COLORSPACE_SRGB);
   2494 }
   2495 
// uint8 resize where one filter and one edge mode are shared by both axes;
// the colorspace and allocation context are chosen by the caller.
STBIRDEF int stbir_resize_uint8_generic( const unsigned char *input_pixels , int input_w , int input_h , int input_stride_in_bytes,
                                               unsigned char *output_pixels, int output_w, int output_h, int output_stride_in_bytes,
                                         int num_channels, int alpha_channel, int flags,
                                         stbir_edge edge_wrap_mode, stbir_filter filter, stbir_colorspace space,
                                         void *alloc_context)
{
    return stbir__resize_arbitrary(alloc_context, input_pixels, input_w, input_h, input_stride_in_bytes,
        output_pixels, output_w, output_h, output_stride_in_bytes,
        0,0,1,1,NULL,num_channels,alpha_channel,flags, STBIR_TYPE_UINT8, filter, filter,
        edge_wrap_mode, edge_wrap_mode, space);
}

// Same as stbir_resize_uint8_generic, but for 16-bit channels (STBIR_TYPE_UINT16).
STBIRDEF int stbir_resize_uint16_generic(const stbir_uint16 *input_pixels  , int input_w , int input_h , int input_stride_in_bytes,
                                               stbir_uint16 *output_pixels , int output_w, int output_h, int output_stride_in_bytes,
                                         int num_channels, int alpha_channel, int flags,
                                         stbir_edge edge_wrap_mode, stbir_filter filter, stbir_colorspace space,
                                         void *alloc_context)
{
    return stbir__resize_arbitrary(alloc_context, input_pixels, input_w, input_h, input_stride_in_bytes,
        output_pixels, output_w, output_h, output_stride_in_bytes,
        0,0,1,1,NULL,num_channels,alpha_channel,flags, STBIR_TYPE_UINT16, filter, filter,
        edge_wrap_mode, edge_wrap_mode, space);
}


// Same as stbir_resize_uint8_generic, but for float channels (STBIR_TYPE_FLOAT).
STBIRDEF int stbir_resize_float_generic( const float *input_pixels         , int input_w , int input_h , int input_stride_in_bytes,
                                               float *output_pixels        , int output_w, int output_h, int output_stride_in_bytes,
                                         int num_channels, int alpha_channel, int flags,
                                         stbir_edge edge_wrap_mode, stbir_filter filter, stbir_colorspace space,
                                         void *alloc_context)
{
    return stbir__resize_arbitrary(alloc_context, input_pixels, input_w, input_h, input_stride_in_bytes,
        output_pixels, output_w, output_h, output_stride_in_bytes,
        0,0,1,1,NULL,num_channels,alpha_channel,flags, STBIR_TYPE_FLOAT, filter, filter,
        edge_wrap_mode, edge_wrap_mode, space);
}


// General whole-image resize: the datatype is chosen at run time, and the
// edge mode and filter can differ between the horizontal and vertical axes.
STBIRDEF int stbir_resize(         const void *input_pixels , int input_w , int input_h , int input_stride_in_bytes,
                                         void *output_pixels, int output_w, int output_h, int output_stride_in_bytes,
                                   stbir_datatype datatype,
                                   int num_channels, int alpha_channel, int flags,
                                   stbir_edge edge_mode_horizontal, stbir_edge edge_mode_vertical,
                                   stbir_filter filter_horizontal,  stbir_filter filter_vertical,
                                   stbir_colorspace space, void *alloc_context)
{
    return stbir__resize_arbitrary(alloc_context, input_pixels, input_w, input_h, input_stride_in_bytes,
        output_pixels, output_w, output_h, output_stride_in_bytes,
        0,0,1,1,NULL,num_channels,alpha_channel,flags, datatype, filter_horizontal, filter_vertical,
        edge_mode_horizontal, edge_mode_vertical, space);
}


// Same as stbir_resize, but with an explicit per-axis scale and offset,
// passed on as a 4-float transform: { x_scale, y_scale, x_offset, y_offset }.
STBIRDEF int stbir_resize_subpixel(const void *input_pixels , int input_w , int input_h , int input_stride_in_bytes,
                                         void *output_pixels, int output_w, int output_h, int output_stride_in_bytes,
                                   stbir_datatype datatype,
                                   int num_channels, int alpha_channel, int flags,
                                   stbir_edge edge_mode_horizontal, stbir_edge edge_mode_vertical,
                                   stbir_filter filter_horizontal,  stbir_filter filter_vertical,
                                   stbir_colorspace space, void *alloc_context,
                                   float x_scale, float y_scale,
                                   float x_offset, float y_offset)
{
    float transform[4];
    transform[0] = x_scale;
    transform[1] = y_scale;
    transform[2] = x_offset;
    transform[3] = y_offset;
    return stbir__resize_arbitrary(alloc_context, input_pixels, input_w, input_h, input_stride_in_bytes,
        output_pixels, output_w, output_h, output_stride_in_bytes,
        0,0,1,1,transform,num_channels,alpha_channel,flags, datatype, filter_horizontal, filter_vertical,
        edge_mode_horizontal, edge_mode_vertical, space);
}

// Same as stbir_resize, but samples only the input region with corners
// (s0,t0) and (s1,t1) instead of the full 0,0,1,1 range.
STBIRDEF int stbir_resize_region(  const void *input_pixels , int input_w , int input_h , int input_stride_in_bytes,
                                         void *output_pixels, int output_w, int output_h, int output_stride_in_bytes,
                                   stbir_datatype datatype,
                                   int num_channels, int alpha_channel, int flags,
                                   stbir_edge edge_mode_horizontal, stbir_edge edge_mode_vertical,
                                   stbir_filter filter_horizontal,  stbir_filter filter_vertical,
                                   stbir_colorspace space, void *alloc_context,
                                   float s0, float t0, float s1, float t1)
{
    return stbir__resize_arbitrary(alloc_context, input_pixels, input_w, input_h, input_stride_in_bytes,
        output_pixels, output_w, output_h, output_stride_in_bytes,
        s0,t0,s1,t1,NULL,num_channels,alpha_channel,flags, datatype, filter_horizontal, filter_vertical,
        edge_mode_horizontal, edge_mode_vertical, space);
}
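
/* Illustrative sketch, not part of the library. The whole-image wrappers
   pass 0,0,1,1 for the region, while stbir_resize_region forwards the
   caller's s0,t0,s1,t1. Assuming those are normalized [0,1] coordinates of
   the region's top-left and bottom-right corners, the helper below converts
   them to input pixel units and derives the scale per axis that the region
   implies. The names example_region and example_region_from_uv are
   hypothetical and exist only for this example. */

```c
#include <assert.h>

/* Hypothetical helper type, not part of stb_image_resize.h. */
typedef struct
{
    float x0, y0, x1, y1;     /* region corners in input pixel units     */
    float x_scale, y_scale;   /* output pixels produced per input pixel  */
} example_region;

/* Map a normalized region (s0,t0)-(s1,t1) onto an input image of
   input_w x input_h pixels and compute the scale needed to fill an
   output of output_w x output_h pixels. Assumes s1 > s0 and t1 > t0. */
static example_region example_region_from_uv(int input_w, int input_h,
                                             int output_w, int output_h,
                                             float s0, float t0, float s1, float t1)
{
    example_region r;

    assert(input_w > 0 && input_h > 0 && output_w > 0 && output_h > 0);
    assert(s1 > s0 && t1 > t0);

    r.x0 = s0 * (float)input_w;
    r.y0 = t0 * (float)input_h;
    r.x1 = s1 * (float)input_w;
    r.y1 = t1 * (float)input_h;

    r.x_scale = (float)output_w / (r.x1 - r.x0);
    r.y_scale = (float)output_h / (r.y1 - r.y0);
    return r;
}
```

/* For a 200x100 input, s0=0.25 and s1=0.75 select columns 50 to 150;
   resizing that span to 50 output pixels gives an x_scale of 0.5. */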

#endif // STB_IMAGE_RESIZE_IMPLEMENTATION

/*
------------------------------------------------------------------------------
This software is available under 2 licenses -- choose whichever you prefer.
------------------------------------------------------------------------------
ALTERNATIVE A - MIT License
Copyright (c) 2017 Sean Barrett
Permission is hereby granted, free of charge, to any person obtaining a copy of
this software and associated documentation files (the "Software"), to deal in
the Software without restriction, including without limitation the rights to
use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
of the Software, and to permit persons to whom the Software is furnished to do
so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
------------------------------------------------------------------------------
ALTERNATIVE B - Public Domain (www.unlicense.org)
This is free and unencumbered software released into the public domain.
Anyone is free to copy, modify, publish, use, compile, sell, or distribute this
software, either in source code form or as a compiled binary, for any purpose,
commercial or non-commercial, and by any means.
In jurisdictions that recognize copyright laws, the author or authors of this
software dedicate any and all copyright interest in the software to the public
domain. We make this dedication for the benefit of the public at large and to
the detriment of our heirs and successors. We intend this dedication to be an
overt act of relinquishment in perpetuity of all present and future rights to
this software under copyright law.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
------------------------------------------------------------------------------
*/