Commits (8)
  • Small clean-ups for the last few commits. · 205d1ae4
    Werner Lemberg authored
    * include/freetype/fttrace.h (afwarp): Removed.
    205d1ae4
  • [autofit] More clean-ups. · 825b7ea2
    Werner Lemberg authored
    * src/autofit/afhints.h (AF_GlyphHintsRec): Remove the no longer
    needed fields `xmin_delta` and `xmax_delta`.
    
    * src/autofit/afhints.c (af_glyph_hints_reload),
    src/autofit/afloader.c (af_loader_load_glyph): Updated.
    825b7ea2
  • * meson.build: Fix build for other UNIX systems (e.g., FreeBSD). · c5516e0f
    Alexander Richardson authored
    Without this change the build of `unix/ftsystem.c` fails because the
    `ftconfig.h` header that defines macros such as `HAVE_UNISTD_H` and
    `HAVE_FCNTL_H` is only being generated for Linux, macOS, and Cygwin
    systems:
    
    ```
    .../builds/unix/ftsystem.c:258:32: error:
        use of undeclared identifier 'O_RDONLY'
    file = open( filepathname, O_RDONLY );
    ```
    
    Instead of hardcoding a list of operating systems for this check,
    update the logic that decides whether to build the file and set a
    boolean flag that can be checked instead.
    c5516e0f
  • [sdf] Improve documentation. · e592982a
    Anuj Verma authored
    e592982a
  • [base] Reject combinations of incompatible `FT_OPEN_XXX` flags. · a4c8f21a
    Oleg Oshmyan authored
    The three modes are mutually exclusive, and the documentation of the
    `FT_OPEN_XXX` constants notes this.  However, there was no check to
    validate this in the code, and the documentation on `FT_Open_Args`
    claimed that the corresponding bits were checked in a well-defined
    order, implying it was valid (if useless) to specify more than one.
    Ironically, this documented order did not agree with the actual
    code, so it could not be relied upon; hopefully, nobody did this and
    nobody will be hurt by the new validation.
    
    Even if multiple mode bits were allowed, they could cause memory
    leaks: if both `FT_OPEN_STREAM` and `stream` are set along with
    either `FT_OPEN_MEMORY` or `FT_OPEN_PATHNAME`, then `FT_Stream_New`
    allocated a new stream but `FT_Open_Face` marked it as an 'external'
    stream, so the stream object was never released.
    
    * src/base/ftobjs.c (FT_Stream_New): Reject incompatible
    `FT_OPEN_XXX` flags.
    a4c8f21a
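The validation described in this commit message can be sketched as follows. This is a minimal illustration, not the FreeType code itself: the `DEMO_OPEN_XXX` macros and both function names are hypothetical stand-ins introduced here, and the only idea taken from the patch is masking out the three mode bits and comparing the result against each single bit.

```c
#include <assert.h>

/* Hypothetical stand-ins for the three mutually exclusive mode bits */
/* plus one unrelated flag; names and values are assumptions.        */
#define DEMO_OPEN_MEMORY    0x1u
#define DEMO_OPEN_STREAM    0x2u
#define DEMO_OPEN_PATHNAME  0x4u
#define DEMO_OPEN_DRIVER    0x8u   /* not a mode bit */

#define DEMO_MODE_MASK \
          ( DEMO_OPEN_MEMORY | DEMO_OPEN_STREAM | DEMO_OPEN_PATHNAME )

/* Return 1 if exactly one mode bit is set in `flags`, 0 otherwise. */
/* Non-mode bits are ignored by the mask.                           */
static int
demo_mode_is_valid( unsigned int  flags )
{
  unsigned int  mode = flags & DEMO_MODE_MASK;


  return mode == DEMO_OPEN_MEMORY   ||
         mode == DEMO_OPEN_STREAM   ||
         mode == DEMO_OPEN_PATHNAME;
}

/* Usage: a single mode is accepted, combinations and 'no mode' are not. */
void
demo_check_modes( void )
{
  assert(  demo_mode_is_valid( DEMO_OPEN_MEMORY ) );
  assert(  demo_mode_is_valid( DEMO_OPEN_PATHNAME | DEMO_OPEN_DRIVER ) );
  assert( !demo_mode_is_valid( DEMO_OPEN_MEMORY | DEMO_OPEN_STREAM ) );
  assert( !demo_mode_is_valid( DEMO_OPEN_DRIVER ) );
}
```

Comparing the masked value for equality, rather than testing each bit with `&`, is what makes a combination such as memory plus stream fall through to the error branch.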
  • [base] Fix `FT_Open_Face`'s handling of user-supplied streams. · 5d27b10f
    Oleg Oshmyan authored
    The user-supplied stream was already closed on failure (though this
    was undocumented) most of the time, but not if `FT_NEW` inside
    `FT_Stream_New` failed or if the `FT_OPEN_XXX` flags were bad.
    
    Normally, `FT_Open_Face` calls `FT_Stream_New`, which returns the
    user-supplied stream unchanged, and in case of any subsequent error
    in `FT_Open_Face`, the stream is closed via `FT_Stream_Free`.
    
    Up to now, however, `FT_Stream_New` allocates a new stream even if
    it is already given one by the user.  If this allocation fails, the
    user-supplied stream is not returned to `FT_Open_Face` and never
    closed.  Moreover, the user cannot detect this situation: all they
    see is that `FT_Open_Face` returns `FT_Err_Out_Of_Memory`, but that
    can also happen after a different allocation fails within the main
    body of `FT_Open_Face`, when the user's stream has already been
    closed by `FT_Open_Face`.  It is plausible that the user stream's
    `close` method frees memory allocated for the stream object itself,
    so the user cannot defensively free it upon `FT_Open_Face` failure
    lest it end up doubly freed.  All in all, this ends up leaking the
    memory/resources used by the user's stream.
    
    Furthermore, `FT_Stream_New` simply returns an error if the
    `FT_OPEN_XXX` flags are unsupported, which can mean either an
    invalid combination of flags or a perfectly innocent
    `FT_OPEN_STREAM` on a FreeType build that lacks stream support.
    With this patch, the user-supplied stream is closed even in these
    cases, so the user can be sure that if `FT_Open_Face` failed, the
    stream is definitely closed.
    
    * src/base/ftobjs.c (FT_Stream_New): Don't allocate a buffer
    unnecessarily.
    Move error-handling code to make the control flow more obvious.
    Close user-supplied stream if the flags are unsupported.
    `FT_Stream_Open` always sets `pathname.pointer`, so remove the
    redundant (re)assignment.  None of the `FT_Stream_Open...` functions
    uses `stream->memory`, so keep just one assignment at the end,
    shared among all possible control flow paths.
    ('Unsupported flags' that may need a stream closure can be either an
    invalid combination of multiple `FT_OPEN_XXX` mode flags or a clean
    `FT_OPEN_STREAM` flag on a FreeType build that lacks stream
    support.)
    5d27b10f
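The ownership rule this commit establishes (if opening fails, the user's stream has been closed) can be sketched with a small mock. Everything below is hypothetical: `DemoStream`, `demo_stream_close`, and `demo_open_face` are illustration-only names, and the `flags_ok` and `alloc_ok` parameters merely simulate the two failure causes named in the message.

```c
#include <assert.h>
#include <stddef.h>

typedef struct DemoStream_  DemoStream;

struct DemoStream_
{
  void  (*close)( DemoStream*  stream );  /* user-supplied close hook   */
  int   close_count;                      /* bookkeeping for this demo  */
};

/* Demo close hook: only records that it was called. */
static void
demo_stream_close( DemoStream*  stream )
{
  stream->close_count++;
}

/* Hypothetical opener that takes ownership of `stream`.  It returns 0 */
/* on success; on every error path it closes the stream exactly once,  */
/* so the caller must not close it again.                              */
static int
demo_open_face( DemoStream*  stream,
                int          flags_ok,
                int          alloc_ok )
{
  assert( stream != NULL );

  if ( !flags_ok || !alloc_ok )
  {
    if ( stream->close )
      stream->close( stream );
    return -1;
  }

  return 0;
}
```

With this contract the caller's error handling is uniform: after a failure it simply drops its reference to the stream, whatever the cause of the failure was.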
  • [smooth] Minor speedup to smooth rasterizer · 7f0cb0cb
    David Turner authored
    This speeds up the smooth rasterizer by avoiding
    conditional branches in the hot path. Namely:
    
    - Define a fixed "null cell" which will be pointed
      to whenever the current cell is outside of the current
      target region. This avoids a "ras.cell != NULL"
      check in the FT_INTEGRATE() macro.
    
    - Also use the null cell as a sentinel at the end of
      all ycells[] linked-lists, by setting its x coordinate
      to INT_MAX. This avoids a 'if (!cell)' check in
      gray_set_cell() as well.
    
    - Slightly change the worker struct fields to perform
      somewhat fewer operations during rendering.
    
    Example results (on a 2013 Core i5-3337U CPU)
    
      out/ftbench -p -s10 -t5 -bc /usr/share/fonts/truetype/droid/DroidSansFallbackFull.ttf
    
        Before: 5.472 us/op
        After:  5.275 us/op
    
      out/ftbench -p -s60 -t5 -bc /usr/share/fonts/truetype/droid/DroidSansFallbackFull.ttf
    
        Before: 17.988 us/op
        After:  17.389 us/op
    7f0cb0cb
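The sentinel idea from this commit message can be sketched on a plain sorted linked list. This is an illustration under assumptions, not the rasterizer code: `DemoCell`, `demo_init_null_cell`, and `demo_find_or_insert` are hypothetical names, and the caller passes in the storage for a new cell instead of using a cell pool.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

typedef struct DemoCell_
{
  int                x;
  int                cover;
  struct DemoCell_*  next;

} DemoCell;

/* Set up the sentinel: its x is larger than any real x coordinate. */
static void
demo_init_null_cell( DemoCell*  null_cell )
{
  null_cell->x     = INT_MAX;
  null_cell->cover = 0;
  null_cell->next  = NULL;
}

/* Find the cell with coordinate `x` in a list sorted by x whose last */
/* element is the sentinel, or link `free_slot` in at the right place. */
static DemoCell*
demo_find_or_insert( DemoCell**  head,
                     int         x,
                     DemoCell*   free_slot )
{
  DemoCell**  pcell = head;
  DemoCell*   cell;


  assert( x < INT_MAX );  /* real cells must sort before the sentinel */

  for (;;)
  {
    cell = *pcell;

    if ( cell->x > x )    /* always true at the sentinel: no NULL test */
      break;

    if ( cell->x == x )
      return cell;

    pcell = &cell->next;
  }

  free_slot->x     = x;
  free_slot->cover = 0;
  free_slot->next  = cell;
  *pcell           = free_slot;

  return free_slot;
}
```

An empty list is represented by a head pointer that points at the sentinel, which is why every list must be initialized to it rather than to `NULL`.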
  • [smooth] Implement Bezier quadratic arc flattening with DDA · da116fc4
    David Turner authored
    Benchmarking shows that this provides a very slight performance
    boost when rendering fonts with lots of quadratic Bezier arcs,
    compared to the recursive arc splitting, but only when SSE2 is
    available, or on 64-bit CPUs.
    
    On a 2017 Core i5-7300U CPU on Linux/x86_64:
    
      ./ftbench -p -s10 -t5 -cb .../DroidSansFallbackFull.ttf
    
        Before: 4.033 us/op  (best of 5 runs for all numbers)
        After:  3.876 us/op
    
      ./ftbench -p -s60 -t5 -cb .../DroidSansFallbackFull.ttf
    
        Before: 13.467 us/op
        After:  13.385 us/op
    da116fc4
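The DDA (forward-differencing) approach named in this commit can be sketched in one dimension. The sketch below is an assumption-laden illustration rather than the patch itself: `demo_conic_dda` is a hypothetical name, it handles a single coordinate, it uses plain 64-bit integers with 32 fractional bits and no SSE2, and it assumes coordinates small enough (roughly below 2^24 in magnitude) that nothing overflows. Powers of two are applied by multiplication so that negative values are not left-shifted.

```c
#include <assert.h>
#include <stdint.h>

/* Evaluate the quadratic Bezier arc (p0, p1, p2) at 2^shift equally   */
/* spaced parameter steps, writing the results to out[] and returning  */
/* their number.  With h = 2^-shift, A = p0 + p2 - 2*p1, B = p1 - p0:  */
/*                                                                     */
/*   Q = 2*B*h + A*h*h   (first difference at t = 0)                   */
/*   R = 2*A*h*h         (constant second difference)                  */
/*                                                                     */
/* All values are scaled by 2^32.  The final `>> 32` assumes an        */
/* arithmetic right shift for negative values.                         */
static unsigned int
demo_conic_dda( int32_t   p0,
                int32_t   p1,
                int32_t   p2,
                int       shift,
                int32_t*  out )
{
  int64_t       a, b, p, q, r;
  unsigned int  count, i;


  assert( shift >= 1 && shift <= 16 );

  a = (int64_t)p0 + p2 - 2 * (int64_t)p1;
  b = (int64_t)p1 - p0;

  r = a * ( (int64_t)1 << ( 33 - 2 * shift ) );
  q = b * ( (int64_t)1 << ( 33 - shift ) ) +
      a * ( (int64_t)1 << ( 32 - 2 * shift ) );
  p = (int64_t)p0 * ( (int64_t)1 << 32 );

  count = 1u << shift;

  for ( i = 0; i < count; i++ )
  {
    p += q;                          /* advance the position   */
    q += r;                          /* advance the difference */
    out[i] = (int32_t)( p >> 32 );
  }

  return count;
}
```

Because every scale factor is a non-negative power of two for `shift` up to 16, the accumulated position is exact, so the last emitted value equals `p2`.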
2021-07-13 David Turner <david@freetype.org>
[smooth] Implement Bezier quadratic arc flattening with DDA
Benchmarking shows that this provides a very slight performance
boost when rendering fonts with lots of quadratic Bezier arcs,
compared to the recursive arc splitting, but only when SSE2 is
available, or on 64-bit CPUs.
* src/smooth/ftgrays.c (gray_render_conic): New implementation
based on DDA and optionally SSE2.
2021-07-13 David Turner <david@freetype.org>
[smooth] Minor speedup to smooth rasterizer
This speeds up the smooth rasterizer by avoiding conditional
branches in the hot path.
* src/smooth/ftgrays.c: Define a null cell used both as a
sentinel for all linked-lists, and to accumulate coverage and
area values for "out-of-bounds" cell positions without a
conditional check.
2021-07-13 Oleg Oshmyan <chortos@inbox.lv>
[base] Fix `FT_Open_Face`'s handling of user-supplied streams.
The user-supplied stream was already closed on failure (though this
was undocumented) most of the time, but not if `FT_NEW` inside
`FT_Stream_New` failed or if the `FT_OPEN_XXX` flags were bad.
Normally, `FT_Open_Face` calls `FT_Stream_New`, which returns the
user-supplied stream unchanged, and in case of any subsequent error
in `FT_Open_Face`, the stream is closed via `FT_Stream_Free`.
Up to now, however, `FT_Stream_New` allocates a new stream even if
it is already given one by the user. If this allocation fails, the
user-supplied stream is not returned to `FT_Open_Face` and never
closed. Moreover, the user cannot detect this situation: all they
see is that `FT_Open_Face` returns `FT_Err_Out_Of_Memory`, but that
can also happen after a different allocation fails within the main
body of `FT_Open_Face`, when the user's stream has already been
closed by `FT_Open_Face`. It is plausible that the user stream's
`close` method frees memory allocated for the stream object itself,
so the user cannot defensively free it upon `FT_Open_Face` failure
lest it end up doubly freed. All in all, this ends up leaking the
memory/resources used by the user's stream.
Furthermore, `FT_Stream_New` simply returns an error if the
`FT_OPEN_XXX` flags are unsupported, which can mean either an
invalid combination of flags or a perfectly innocent
`FT_OPEN_STREAM` on a FreeType build that lacks stream support.
With this patch, the user-supplied stream is closed even in these
cases, so the user can be sure that if `FT_Open_Face` failed, the
stream is definitely closed.
* src/base/ftobjs.c (FT_Stream_New): Don't allocate a buffer
unnecessarily.
Move error-handling code to make the control flow more obvious.
Close user-supplied stream if the flags are unsupported.
`FT_Stream_Open` always sets `pathname.pointer`, so remove the
redundant (re)assignment. None of the `FT_Stream_Open...` functions
uses `stream->memory`, so keep just one assignment at the end,
shared among all possible control flow paths.
('Unsupported flags' that may need a stream closure can be either an
invalid combination of multiple `FT_OPEN_XXX` mode flags or a clean
`FT_OPEN_STREAM` flag on a FreeType build that lacks stream
support.)
2021-07-13 Oleg Oshmyan <chortos@inbox.lv>
[base] Reject combinations of incompatible `FT_OPEN_XXX` flags.
The three modes are mutually exclusive, and the documentation of the
`FT_OPEN_XXX` constants notes this. However, there was no check to
validate this in the code, and the documentation on `FT_Open_Args`
claimed that the corresponding bits were checked in a well-defined
order, implying it was valid (if useless) to specify more than one.
Ironically, this documented order did not agree with the actual
code, so it could not be relied upon; hopefully, nobody did this and
nobody will be hurt by the new validation.
Even if multiple mode bits were allowed, they could cause memory
leaks: if both `FT_OPEN_STREAM` and `stream` are set along with
either `FT_OPEN_MEMORY` or `FT_OPEN_PATHNAME`, then `FT_Stream_New`
allocated a new stream but `FT_Open_Face` marked it as an 'external'
stream, so the stream object was never released.
* src/base/ftobjs.c (FT_Stream_New): Reject incompatible
`FT_OPEN_XXX` flags.
2021-07-12 Alex Richardson <Alexander.Richardson@cl.cam.ac.uk>
* meson.build: Fix build for other UNIX systems (e.g., FreeBSD).
Without this change the build of `unix/ftsystem.c` fails because the
`ftconfig.h` header that defines macros such as `HAVE_UNISTD_H` and
`HAVE_FCNTL_H` is only being generated for Linux, macOS, and Cygwin
systems:
```
.../builds/unix/ftsystem.c:258:32: error:
use of undeclared identifier 'O_RDONLY'
file = open( filepathname, O_RDONLY );
```
Instead of hardcoding a list of operating systems for this check,
update the logic that decides whether to build the file and set a
boolean flag that can be checked instead.
2021-07-12 Werner Lemberg <wl@gnu.org>
[autofit] More clean-ups.
* src/autofit/afhints.h (AF_GlyphHintsRec): Remove the no longer
needed fields `xmin_delta` and `xmax_delta`.
* src/autofit/afhints.c (af_glyph_hints_reload),
src/autofit/afloader.c (af_loader_load_glyph): Updated.
2021-07-12 Werner Lemberg <wl@gnu.org>
Small clean-ups for the last few commits.
* include/freetype/fttrace.h (afwarp): Removed.
2021-07-12 David Turner <david@freetype.org>
Remove obsolete AF_Angle type and related sources.
Remove obsolete `AF_Angle` type and related sources.
Move the af_sort_xxx() functions from afangles.c to afhints.c in
order to get rid of the obsolete angle-related types, macros and
function definitions.
* src/autofit/afangles.c: File removed. Functions related to
sorting moved to...
* src/autofit/afhints.c (af_sort_pos, af_sort_and_quantize_widths):
This file.
* src/autofit/afangles.h: File removed.
* src/autofit/aftypes.h: Updated.
* src/autofit/autofit.c: Updated.
* src/autofit/*: Remove code.
* src/autofit/rules.mk (AUTOF_DRV_SRC): Updated.
2021-07-12 David Turner <david@freetype.org>
Remove experimental auto-hinting 'warp' mode.
This feature was always experimental, and probably nevery worked
properly. This patch completely removes it from the source code,
This feature was always experimental, and probably never worked
properly. This patch completely removes it from the source code,
except for a documentation block describing it for historical
purpose.
purposes.
* devel/ftoption.h, include/freetype/config/ftoption.h: Remove
`AF_CONFIG_OPTION_USE_WARPER`.
* include/freetype/ftdriver.h: Document 'warping' property as
obsolete.
* devel/ftoption.h: Remove AF_CONFIG_OPTION_USE_WARPER.
* include/freetype/config/ftoption.h: Remove AF_CONFIG_OPTION_USE_WARPER.
* include/freetype/ftdriver.h: Document 'warping' property as obsolete.
* src/autofit/*: Remove any warp mode related code.
* src/autofit/afwarp.c, src/autofit/afwarp.h: Files removed.
* src/autofit/*: Remove any code related to warp mode.
2021-07-12 David Turner <david@freetype.org>
Remove experimental "Latin2" writing system (FT_OPTION_AUTOFIT2)
Remove experimental 'Latin2' writing system (`FT_OPTION_AUTOFIT2`).
This code has always been experimental and was never compiled anyway
(FT_OPTION_AUTOFIT2 does not appear in ftoption.h or even any of our
build files).
* include/freetype/internal/fttrace.h: Remove 'FT_TRACE_DEF( aflatin2 )'.
* src/autofit/aflatin2.[hc]: Removed.
* src/autofit/afloader.c: Remove undocumented hook to activate Latin2 system.
* src/autofit/afstyles.h: Remove ltn2_dflt style definition.
* src/autofit/afwrtsys.h: Remove LATIN2 writing system definition.
(`FT_OPTION_AUTOFIT2` does not appear in `ftoption.h` or even any of
our build files).
* include/freetype/internal/fttrace.h (aflatin2): Removed.
* src/autofit/aflatin2.h, src/autofit/aflatin2.c: Files removed.
* src/autofit/afloader.c: Remove undocumented hook to activate
Latin2 system.
* src/autofit/afstyles.h: Remove `ltn2_dflt` style definition.
* src/autofit/afwrtsys.h: Remove `LATIN2` writing system definition.
* src/autofit/autofit.c: Updated.
2021-07-05 Werner Lemberg <wl@gnu.org>
......
......@@ -105,8 +105,7 @@ FT_BEGIN_HEADER
*
* ```
* FREETYPE_PROPERTIES=truetype:interpreter-version=35 \
* cff:no-stem-darkening=1 \
* autofitter:warping=1
* cff:no-stem-darkening=1
* ```
*
*/
......
......@@ -72,6 +72,9 @@ CHANGES BETWEEN 2.10.4 and 2.11.0
This work was Priyesh Kumar's GSoC 2020 project.
- The experimental 'warp' mode (AF_CONFIG_OPTION_USE_WARPER) for the
auto-hinter has been removed.
- The smooth rasterizer performance has been improved by >10%.
- PCF bitmap fonts compressed with LZW (these are usually files with
......
......@@ -105,8 +105,7 @@ FT_BEGIN_HEADER
*
* ```
* FREETYPE_PROPERTIES=truetype:interpreter-version=35 \
* cff:no-stem-darkening=1 \
* autofitter:warping=1
* cff:no-stem-darkening=1
* ```
*
*/
......
......@@ -2113,8 +2113,7 @@ FT_BEGIN_HEADER
* Extra parameters passed to the font driver when opening a new face.
*
* @note:
* The stream type is determined by the contents of `flags` that are
* tested in the following order by @FT_Open_Face:
* The stream type is determined by the contents of `flags`:
*
* If the @FT_OPEN_MEMORY bit is set, assume that this is a memory file
* of `memory_size` bytes, located at `memory_address`. The data are not
......@@ -2127,6 +2126,9 @@ FT_BEGIN_HEADER
* Otherwise, if the @FT_OPEN_PATHNAME bit is set, assume that this is a
* normal file and use `pathname` to open it.
*
* If none of the above bits are set or if multiple are set at the same
* time, the flags are invalid and @FT_Open_Face fails.
*
* If the @FT_OPEN_DRIVER bit is set, @FT_Open_Face only tries to open
* the file with the driver whose handler is in `driver`.
*
......@@ -2299,6 +2301,10 @@ FT_BEGIN_HEADER
* See the discussion of reference counters in the description of
* @FT_Reference_Face.
*
* If `FT_OPEN_STREAM` is set in `args->flags`, the stream in
* `args->stream` is automatically closed before this function returns
* any error (including `FT_Err_Invalid_Argument`).
*
* @example:
* To loop over all faces, use code similar to the following snippet
* (omitting the error handling).
......@@ -3307,13 +3313,13 @@ FT_BEGIN_HEADER
* pixels and use the @FT_PIXEL_MODE_LCD_V mode.
*
* FT_RENDER_MODE_SDF ::
* This mode corresponds to 8-bit signed distance fields (SDF)
* bitmaps. Each pixel in a SDF bitmap contains information about the
* nearest edge of the glyph outline. The distances are calculated
* from the center of the pixel and are positive if they are filled by
* the outline (i.e., inside the outline) and negative otherwise.
* Check the note below on how to convert the output values to usable
* data.
* This mode corresponds to 8-bit, single-channel signed distance field
* (SDF) bitmaps. Each pixel in the SDF grid holds the distance from the
* pixel's position to the nearest glyph's outline. The distances are
* calculated from the center of the pixel and are positive if they are
* filled by the outline (i.e., inside the outline) and negative
* otherwise. Check the note below on how to convert the output values
* to usable data.
*
* @note:
* The selected render mode only affects vector glyphs of a font.
......
......@@ -53,10 +53,10 @@ FT_BEGIN_HEADER
* reasons.
*
* Available properties are @increase-x-height, @no-stem-darkening
* (experimental), @darkening-parameters (experimental), @warping
* (experimental), @glyph-to-script-map (experimental), @fallback-script
* (experimental), and @default-script (experimental), as documented in
* the @properties section.
* (experimental), @darkening-parameters (experimental),
* @glyph-to-script-map (experimental), @fallback-script (experimental),
* and @default-script (experimental), as documented in the @properties
* section.
*
*/
......@@ -1165,15 +1165,15 @@ FT_BEGIN_HEADER
* **Obsolete**
*
* This property was always experimental and probably never worked
* correctly. It was entirely removed from the FreeType 2 sources.
* This entry is only here for historical reference.
* correctly. It was entirely removed from the FreeType~2 sources. This
* entry is only here for historical reference.
*
* Warping only works in 'normal' auto-hinting mode replacing it. The
* idea of the code is to slightly scale and shift a glyph along the
* Warping only worked in 'normal' auto-hinting mode replacing it. The
* idea of the code was to slightly scale and shift a glyph along the
* non-hinted dimension (which is usually the horizontal axis) so that as
* much of its segments are aligned (more or less) to the grid. To find
* much of its segments were aligned (more or less) to the grid. To find
* out a glyph's optimal scaling and shifting value, various parameter
* combinations are tried and scored.
* combinations were tried and scored.
*
* @since:
* 2.6
......
......@@ -508,8 +508,7 @@ FT_BEGIN_HEADER
*
* ```
* FREETYPE_PROPERTIES=truetype:interpreter-version=35 \
* cff:no-stem-darkening=0 \
* autofitter:warping=1
* cff:no-stem-darkening=0
* ```
*
* @inout:
......
......@@ -160,7 +160,6 @@ FT_TRACE_DEF( afhints )
FT_TRACE_DEF( afmodule )
FT_TRACE_DEF( aflatin )
FT_TRACE_DEF( afshaper )
FT_TRACE_DEF( afwarp )
/* SDF components */
FT_TRACE_DEF( sdf ) /* signed distance raster for outlines (ftsdf.c) */
......
......@@ -193,6 +193,7 @@ has_sys_mman_h = cc.has_header('sys/mman.h')
mmap_option = get_option('mmap')
use_unix_ftsystem_c = false
if mmap_option.disabled()
ft2_sources += files(['src/base/ftsystem.c',])
elif host_machine.system() == 'windows'
......@@ -201,6 +202,7 @@ else
if has_unistd_h and has_fcntl_h and has_sys_mman_h
# This version of `ftsystem.c` uses `mmap` to read input font files.
ft2_sources += files(['builds/unix/ftsystem.c',])
use_unix_ftsystem_c = true
elif mmap_option.enabled()
error('mmap was enabled via options but is not available,'
+ ' required headers were not found!')
......@@ -321,7 +323,7 @@ if has_fcntl_h
ftconfig_command += '--enable=HAVE_FCNTL_H'
endif
if host_machine.system() in ['linux', 'darwin', 'cygwin']
if use_unix_ftsystem_c
ftconfig_h_in = files('builds/unix/ftconfig.h.in')
ftconfig_h = custom_target('ftconfig.h',
input: ftconfig_h_in,
......
......@@ -953,9 +953,6 @@
hints->x_delta = x_delta;
hints->y_delta = y_delta;
hints->xmin_delta = 0;
hints->xmax_delta = 0;
points = hints->points;
if ( hints->num_points == 0 )
goto Exit;
......
......@@ -362,9 +362,6 @@ FT_BEGIN_HEADER
/* implementations */
AF_StyleMetrics metrics;
FT_Pos xmin_delta; /* used for warping */
FT_Pos xmax_delta;
/* Two arrays to avoid allocation penalty. */
/* The `embedded' structure must be the last element! */
struct
......
......@@ -473,8 +473,8 @@
FT_Pos pp2x = loader->pp2.x;
loader->pp1.x = FT_PIX_ROUND( pp1x + hints->xmin_delta );
loader->pp2.x = FT_PIX_ROUND( pp2x + hints->xmax_delta );
loader->pp1.x = FT_PIX_ROUND( pp1x );
loader->pp2.x = FT_PIX_ROUND( pp2x );
slot->lsb_delta = loader->pp1.x - pp1x;
slot->rsb_delta = loader->pp2.x - pp2x;
......
......@@ -197,6 +197,7 @@
FT_Error error;
FT_Memory memory;
FT_Stream stream = NULL;
FT_UInt mode;
*astream = NULL;
......@@ -208,15 +209,15 @@
return FT_THROW( Invalid_Argument );
memory = library->memory;
mode = args->flags &
( FT_OPEN_MEMORY | FT_OPEN_STREAM | FT_OPEN_PATHNAME );
if ( FT_NEW( stream ) )
goto Exit;
stream->memory = memory;
if ( args->flags & FT_OPEN_MEMORY )
if ( mode == FT_OPEN_MEMORY )
{
/* create a memory-based stream */
if ( FT_NEW( stream ) )
goto Exit;
FT_Stream_OpenMemory( stream,
(const FT_Byte*)args->memory_base,
(FT_ULong)args->memory_size );
......@@ -224,33 +225,40 @@
#ifndef FT_CONFIG_OPTION_DISABLE_STREAM_SUPPORT
else if ( args->flags & FT_OPEN_PATHNAME )
else if ( mode == FT_OPEN_PATHNAME )
{
/* create a normal system stream */
if ( FT_NEW( stream ) )
goto Exit;
error = FT_Stream_Open( stream, args->pathname );
stream->pathname.pointer = args->pathname;
if ( error )
FT_FREE( stream );
}
else if ( ( args->flags & FT_OPEN_STREAM ) && args->stream )
else if ( ( mode == FT_OPEN_STREAM ) && args->stream )
{
/* use an existing, user-provided stream */
/* in this case, we do not need to allocate a new stream object */
/* since the caller is responsible for closing it himself */
FT_FREE( stream );
stream = args->stream;
error = FT_Err_Ok;
}
#endif
else
{
error = FT_THROW( Invalid_Argument );
if ( ( args->flags & FT_OPEN_STREAM ) && args->stream )
FT_Stream_Close( args->stream );
}
if ( error )
FT_FREE( stream );
else
stream->memory = memory; /* just to be certain */
*astream = stream;
if ( !error )
{
stream->memory = memory;
*astream = stream;
}
Exit:
return error;
......
......@@ -41,7 +41,8 @@
* file `ftbsdf.c` for more.
*
* * The basic idea of generating the SDF is taken from Viktor Chlumsky's
* research paper.
* research paper. The paper explains both single and multi-channel
* SDF; however, this implementation only generates single-channel SDF.
*
* Chlumsky, Viktor: Shape Decomposition for Multi-channel Distance
* Fields. Master's thesis. Czech Technical University in Prague,
......
......@@ -479,19 +479,24 @@ typedef ptrdiff_t FT_PtrDist;
{
ft_jmp_buf jump_buffer;
TCoord min_ex, max_ex;
TCoord min_ex, max_ex; /* min and max integer pixel coordinates */
TCoord min_ey, max_ey;
TCoord count_ey; /* same as (max_ey - min_ey) */
PCell cell;
PCell* ycells;
PCell cells;
FT_PtrDist max_cells;
FT_PtrDist num_cells;
PCell cell; /* current cell */
PCell cell_free; /* cell allocation next free slot */
PCell cell_limit; /* cell allocation limit */
TPos x, y;
PCell* ycells; /* array of cell linked-lists, one per */
/* vertical coordinate in the current band. */
FT_Outline outline;
TPixmap target;
PCell cells; /* cell storage area */
FT_PtrDist max_cells; /* cell storage capacity */
TPos x, y; /* last point position */
FT_Outline outline; /* input outline */
TPixmap target; /* target pixmap */
FT_Raster_Span_Func render_span;
void* render_span_data;
......@@ -502,21 +507,34 @@ typedef ptrdiff_t FT_PtrDist;
#pragma warning( pop )
#endif
#ifndef FT_STATIC_RASTER
#define ras (*worker)
#else
static gray_TWorker ras;
#endif
#define FT_INTEGRATE( ras, a, b ) \
if ( ras.cell ) \
ras.cell->cover += (a), ras.cell->area += (a) * (TArea)(b)
/* Return a pointer to the "null cell", used as a sentinel at the end */
/* of all ycells[] linked lists. Its x coordinate should be maximal */
/* to ensure no NULL checks are necessary when looking for an insertion */
/* point in gray_set_cell(). Other loops should check the cell pointer */
/* with CELL_IS_NULL() to detect the end of the list. */
#define NULL_CELL_PTR(ras) (ras).cells
/* The |x| value of the null cell. Must be the largest possible */
/* integer value stored in a TCell.x field. */
#define CELL_MAX_X_VALUE INT_MAX
/* Return true iff |cell| points to the null cell. */
#define CELL_IS_NULL(cell) ((cell)->x == CELL_MAX_X_VALUE)
#define FT_INTEGRATE( ras, a, b ) \
ras.cell->cover += (a), ras.cell->area += (a) * (TArea)(b)
typedef struct gray_TRaster_
{
void* memory;
void* memory;
} gray_TRaster, *gray_PRaster;
......@@ -538,7 +556,7 @@ typedef ptrdiff_t FT_PtrDist;
printf( "%3d:", y );
for ( ; cell != NULL; cell = cell->next )
for ( ; !CELL_IS_NULL(cell); cell = cell->next )
printf( " (%3d, c:%4d, a:%6d)",
cell->x, cell->cover, cell->area );
printf( "\n" );
......@@ -566,11 +584,12 @@ typedef ptrdiff_t FT_PtrDist;
/* Note that if a cell is to the left of the clipping region, it is */
/* actually set to the (min_ex-1) horizontal position. */
if ( ey >= ras.max_ey || ey < ras.min_ey || ex >= ras.max_ex )
ras.cell = NULL;
TCoord ey_index = ey - ras.min_ey;
if ( ey_index < 0 || ey_index >= ras.count_ey || ex >= ras.max_ex )
ras.cell = NULL_CELL_PTR(ras);
else
{
PCell* pcell = ras.ycells + ey - ras.min_ey;
PCell* pcell = ras.ycells + ey_index;
PCell cell;
......@@ -580,7 +599,7 @@ typedef ptrdiff_t FT_PtrDist;
{
cell = *pcell;
if ( !cell || cell->x > ex )
if ( cell->x > ex )
break;
if ( cell->x == ex )
......@@ -589,11 +608,11 @@ typedef ptrdiff_t FT_PtrDist;
pcell = &cell->next;
}
if ( ras.num_cells >= ras.max_cells )
/* insert new cell */
cell = ras.cell_free++;
if (cell >= ras.cell_limit)
ft_longjmp( ras.jump_buffer, 1 );
/* insert new cell */
cell = ras.cells + ras.num_cells++;
cell->x = ex;
cell->area = 0;
cell->cover = 0;
......@@ -974,6 +993,188 @@ typedef ptrdiff_t FT_PtrDist;
#endif
/* Benchmarking shows that using DDA to flatten the quadratic bezier
* arcs is slightly faster in the following cases:
*
* - When the host CPU is 64-bit.
* - When SSE2 SIMD registers and instructions are available (even on x86).
*
* For other cases, using binary splits is actually slightly faster.
*/
#if defined(__SSE2__) || defined(__x86_64__) || defined(__aarch64__) || defined(_M_AMD64) || defined(_M_ARM64)
#define BEZIER_USE_DDA 1
#else
#define BEZIER_USE_DDA 0
#endif
#if BEZIER_USE_DDA
#include <emmintrin.h>
static void
gray_render_conic( RAS_ARG_ const FT_Vector* control,
const FT_Vector* to )
{
FT_Vector p0, p1, p2;
p0.x = ras.x;
p0.y = ras.y;
p1.x = UPSCALE( control->x );
p1.y = UPSCALE( control->y );
p2.x = UPSCALE( to->x );
p2.y = UPSCALE( to->y );
/* short-cut the arc that lies entirely above or below the current band */
if ( ( TRUNC( p0.y ) >= ras.max_ey &&
TRUNC( p1.y ) >= ras.max_ey &&
TRUNC( p2.y ) >= ras.max_ey ) ||
( TRUNC( p0.y ) < ras.min_ey &&
TRUNC( p1.y ) < ras.min_ey &&
TRUNC( p2.y ) < ras.min_ey ) )
{
ras.x = p2.x;
ras.y = p2.y;
return;
}
TPos dx = FT_ABS( p0.x + p2.x - 2 * p1.x );
TPos dy = FT_ABS( p0.y + p2.y - 2 * p1.y );
if ( dx < dy )
dx = dy;
if ( dx <= ONE_PIXEL / 4 )
{
gray_render_line( RAS_VAR_ p2.x, p2.y );
return;
}
/* We can calculate the number of necessary bisections because */
/* each bisection predictably reduces deviation exactly 4-fold. */
/* Even 32-bit deviation would vanish after 16 bisections. */
int shift = 0;
do
{
dx >>= 2;
shift += 1;
}
while (dx > ONE_PIXEL / 4);
/*
* The (P0,P1,P2) arc equation, for t in [0,1] range:
*
* P(t) = P0*(1-t)^2 + P1*2*t*(1-t) + P2*t^2
*
* P(t) = P0 + 2*(P1-P0)*t + (P0+P2-2*P1)*t^2
* = P0 + 2*B*t + A*t^2
*
* for A = P0 + P2 - 2*P1
* and B = P1 - P0
*
* Let's consider the difference when advancing by a small
* parameter h:
*
* Q(h,t) = P(t+h) - P(t) = 2*B*h + A*h^2 + 2*A*h*t
*
* And then its own difference:
*
* R(h,t) = Q(h,t+h) - Q(h,t) = 2*A*h*h = R (constant)
*
* Since R is always a constant, it is possible to compute
* successive positions with:
*
* P = P0
* Q = Q(h,0) = 2*B*h + A*h*h
* R = 2*A*h*h
*
* loop:
* P += Q
* Q += R
* EMIT(P)
*
* To ensure accurate results, perform computations on 64-bit
* values, after scaling them by 2^32:
*
* R << 32 = 2 * A << (32 - N - N)
* = A << (33 - 2 *N)
*
* Q << 32 = (2 * B << (32 - N)) + (A << (32 - N - N))
* = (B << (33 - N)) + (A << (32 - N - N))
*/
#ifdef __SSE2__
/* Experience shows that for small shift values, SSE2 is actually slower. */
if (shift > 2) {
union {
struct { FT_Int64 ax, ay, bx, by; } i;
struct { __m128i a, b; } vec;
} u;
u.i.ax = p0.x + p2.x - 2 * p1.x;
u.i.ay = p0.y + p2.y - 2 * p1.y;
u.i.bx = p1.x - p0.x;
u.i.by = p1.y - p0.y;
__m128i a = _mm_load_si128(&u.vec.a);
__m128i b = _mm_load_si128(&u.vec.b);
__m128i r = _mm_slli_epi64(a, 33 - 2 * shift);
__m128i q = _mm_slli_epi64(b, 33 - shift);
__m128i q2 = _mm_slli_epi64(a, 32 - 2 * shift);
q = _mm_add_epi64(q2, q);
union {
struct { FT_Int32 px_lo, px_hi, py_lo, py_hi; } i;
__m128i vec;
} v;
v.i.px_lo = 0;
v.i.px_hi = p0.x;
v.i.py_lo = 0;
v.i.py_hi = p0.y;
__m128i p = _mm_load_si128(&v.vec);
for (unsigned count = (1u << shift); count > 0; count--) {
p = _mm_add_epi64(p, q);
q = _mm_add_epi64(q, r);
_mm_store_si128(&v.vec, p);
gray_render_line( RAS_VAR_ v.i.px_hi, v.i.py_hi);
}
return;
}
#endif /* __SSE2__ */
FT_Int64 ax = p0.x + p2.x - 2 * p1.x;
FT_Int64 ay = p0.y + p2.y - 2 * p1.y;
FT_Int64 bx = p1.x - p0.x;
FT_Int64 by = p1.y - p0.y;
FT_Int64 rx = ax << (33 - 2 * shift);
FT_Int64 ry = ay << (33 - 2 * shift);
FT_Int64 qx = (bx << (33 - shift)) + (ax << (32 - 2 * shift));
FT_Int64 qy = (by << (33 - shift)) + (ay << (32 - 2 * shift));
FT_Int64 px = (FT_Int64)p0.x << 32;
FT_Int64 py = (FT_Int64)p0.y << 32;
FT_UInt count = 1u << shift;
for (; count > 0; count--) {
px += qx;
py += qy;
qx += rx;
qy += ry;
gray_render_line( RAS_VAR_ (FT_Pos)(px >> 32), (FT_Pos)(py >> 32));
}
}
#else /* !BEZIER_USE_DDA */
/* Note that multiple attempts to speed up the function below
* with SSE2 intrinsics, using various data layouts, have turned
* out to be slower than the non-SIMD code below.
*/
static void
gray_split_conic( FT_Vector* base )
{
......@@ -1059,7 +1260,15 @@ typedef ptrdiff_t FT_PtrDist;
} while ( --draw );
}
#endif /* !BEZIER_USE_DDA */
/* For cubic bezier, binary splits are still faster than DDA
* because the splits are adaptive to how quickly each sub-arc
* approaches its chord trisection points.
*
* It might be useful to experiment with SSE2 to speed up
* gray_split_cubic() though.
*/
static void
gray_split_cubic( FT_Vector* base )
{
......@@ -1150,7 +1359,6 @@ typedef ptrdiff_t FT_PtrDist;
}
}
static int
gray_move_to( const FT_Vector* to,
gray_PWorker worker )
......@@ -1218,7 +1426,7 @@ typedef ptrdiff_t FT_PtrDist;
unsigned char* line = ras.target.origin - ras.target.pitch * y;
for ( ; cell != NULL; cell = cell->next )
for ( ; !CELL_IS_NULL(cell); cell = cell->next )
{
if ( cover != 0 && cell->x > x )
{
......@@ -1266,7 +1474,7 @@ typedef ptrdiff_t FT_PtrDist;
TArea area;
for ( ; cell != NULL; cell = cell->next )
for ( ; !CELL_IS_NULL(cell); cell = cell->next )
{
if ( cover != 0 && cell->x > x )
{
......@@ -1646,8 +1854,8 @@ typedef ptrdiff_t FT_PtrDist;
FT_TRACE7(( "band [%d..%d]: %ld cell%s\n",
ras.min_ey,
ras.max_ey,
ras.num_cells,
ras.num_cells == 1 ? "" : "s" ));
ras.cell_free - ras.cells,
ras.cell_free - ras.cells == 1 ? "" : "s" ));
}
else
{
......@@ -1690,8 +1898,18 @@ typedef ptrdiff_t FT_PtrDist;
ras.cells = buffer + n;
ras.max_cells = (FT_PtrDist)( FT_MAX_GRAY_POOL - n );
ras.cell_limit = ras.cells + ras.max_cells;
ras.ycells = (PCell*)buffer;
/* Initialize the null cell at the start of the 'cells' array. */
/* Note that this requires ras.cell_free initialization to skip */
/* over the first entry in the array. */
PCell null_cell = NULL_CELL_PTR(ras);
null_cell->x = CELL_MAX_X_VALUE;
null_cell->area = 0;
null_cell->cover = 0;
null_cell->next = NULL;
for ( y = yMin; y < yMax; )
{
ras.min_ey = y;
......@@ -1705,15 +1923,17 @@ typedef ptrdiff_t FT_PtrDist;
do
{
TCoord width = band[0] - band[1];
TCoord w;
int error;
for (w = 0; w < width; ++w)
ras.ycells[w] = null_cell;
FT_MEM_ZERO( ras.ycells, height * sizeof ( PCell ) );
ras.num_cells = 0;
ras.cell = NULL;
ras.cell_free = ras.cells + 1; /* NOTE: Skip over the null cell. */
ras.cell = null_cell;
ras.min_ey = band[1];
ras.max_ey = band[0];
ras.count_ey = width;
error = gray_convert_glyph_inner( RAS_VAR, continued );
continued = 1;
......