This release features many library changes including a few new algorithms and substantial improvements in existing algorithms, notably making mesh simplification and clusterization better, as well as gltfpack fixes and improvements.
Library improvements
- meshopt_simplify now has an extra output parameter, result_error, which will contain the relative simplification error (which can be converted to absolute with meshopt_simplifyScale)
- meshopt_simplifySloppy interface has changed to align with meshopt_simplify: the function now expects a larger output index buffer size and accepts an input error that restricts simplification as well as an output error.
- meshopt_simplify now tries to avoid simplifications that result in triangle flips; this substantially improves triangulation quality at a moderate performance cost.
- meshopt_buildMeshlets interface has changed to allow for almost arbitrary meshlet vertex/triangle limits by outputting three separate arrays of meshlet data instead of one; the resulting layout is also more compact and is often more GPU-friendly.
- meshopt_buildMeshlets now implements a new, more expensive algorithm that generates meshlets that are optimized for a balance of vertex reuse, spatial coherency and cone culling efficiency, controlled with an extra cone_weight parameter. The old linear-time algorithm is still available as meshopt_buildMeshletsScan.
- Implement a new algorithm, meshopt_generateTessellationIndexBuffer, that can be used to generate a special index buffer that, together with hardware tessellation stage, can efficiently implement crack-free PN-AEN tessellation for arbitrary meshes.
- Optimize Wasm SIMD variant of meshopt_decodeVertexBuffer, making it ~5% faster
- Fix SIMD decoder filters (meshopt_decodeFilter*) when vertex count wasn't a multiple of 4
- Fix undefined behavior when decoding some invalid compressed index buffers with meshopt_decodeIndexBuffer
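To make the "three separate arrays" meshlet layout more concrete, here is a minimal Python sketch of a linear scan builder. It is not the library's implementation: the function name build_meshlets_scan, the tuple field order (vertex_offset, triangle_offset, vertex_count, triangle_count) and the greedy splitting rule are assumptions made for illustration only.

```python
def build_meshlets_scan(indices, max_vertices, max_triangles):
    """Greedy linear scan over a triangle list (illustrative sketch, hypothetical names).

    Returns three arrays:
      meshlets          - (vertex_offset, triangle_offset, vertex_count, triangle_count)
      meshlet_vertices  - indices into the original vertex buffer
      meshlet_triangles - meshlet-local vertex indices, 3 per triangle
    """
    meshlets, meshlet_vertices, meshlet_triangles = [], [], []
    local = {}  # original vertex index -> meshlet-local index
    vertex_offset = triangle_offset = 0

    for i in range(0, len(indices), 3):
        tri = indices[i:i + 3]
        new_vertices = sum(1 for v in set(tri) if v not in local)
        triangle_count = (len(meshlet_triangles) - triangle_offset) // 3

        # Close the current meshlet when this triangle would exceed either limit.
        if local and (len(local) + new_vertices > max_vertices
                      or triangle_count + 1 > max_triangles):
            meshlets.append((vertex_offset, triangle_offset, len(local), triangle_count))
            vertex_offset = len(meshlet_vertices)
            triangle_offset = len(meshlet_triangles)
            local = {}

        for v in tri:
            if v not in local:
                local[v] = len(local)
                meshlet_vertices.append(v)
            meshlet_triangles.append(local[v])

    if local:
        triangle_count = (len(meshlet_triangles) - triangle_offset) // 3
        meshlets.append((vertex_offset, triangle_offset, len(local), triangle_count))

    return meshlets, meshlet_vertices, meshlet_triangles
```

Because each meshlet only stores offsets and counts, the vertex and triangle limits can vary freely without changing the size of a meshlet record, which is the property the new interface is described as providing.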
gltfpack improvements
- Implement support for KHR_materials_variants
- Implement support for recent versions of PBR-next extensions, including KHR_materials_volume and KHR_materials_specular
- Fix issues with running texture compression tools (toktx, basisu) in various environments
- Fix support for older versions of Node.js
- Fix processing for some scenes with clearcoat materials that didn't have a diffuse texture
- Fix handling of absolute paths in Node.js builds
- Fix processing for scenes with KHR_texture_transform extension when quantization is disabled
This release focuses on gltfpack improvements and also features small changes to the simplifier that improve quality in some edge cases.
gltfpack highlights
gltfpack improves support for instanced meshes substantially in this release. Previously, all instances of the same mesh were merged together unconditionally, which could result in large file sizes and/or memory consumption; by default the instances are now kept as is. -mm can be used to merge the geometry of the instances together, or alternatively -mi can be used to encode the instance data using EXT_mesh_gpu_instancing which, given a compatible loader, can significantly reduce the transmission size and improve loading and rendering performance.
To improve support for large scenes even further, gltfpack is now much more memory efficient, requiring ~40% less memory for processing on average.
The extension that gltfpack uses to compress geometry, animation and instance data is now part of glTF and is called EXT_meshopt_compression; gltfpack was changed accordingly to output compressed files conforming to the up-to-date specification. This requires loaders to update to the new extension; https://github.com/zeux/meshoptimizer/tree/master/js contains plugins for three.js and Babylon.js and work is underway to integrate these directly upstream.
For texture compression, gltfpack is switching to toktx from KTX-Software; this enables support for super-compressed UASTC textures and support for texture scaling during encoding (via -ts option) which can further reduce the file size. Additionally when using toktx, gltfpack now pads the textures to a multiple of 4 to ensure compatibility with WebGL, and can optionally (via -tp option) pad to a power of 2 for older browsers. basisu command-line tool is still supported for now and automatically used if toktx is not available.
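The dimension arithmetic behind padding to a multiple of 4 or to a power of 2 can be sketched in a few lines of Python. The helper names below are hypothetical, and rounding up (rather than to the nearest value) is an assumption; the sketch only computes target dimensions and does not reflect how gltfpack or toktx resample or pad pixel data.

```python
def pad_to_multiple(size, multiple=4):
    """Round size up to the next multiple (block-compressed formats use 4x4 blocks)."""
    return (size + multiple - 1) // multiple * multiple


def next_power_of_two(size):
    """Round size up to the next power of two (assumed rounding direction)."""
    power = 1
    while power < size:
        power *= 2
    return power


def padded_dimensions(width, height, power_of_two=False):
    """Target texture dimensions for the default and the power-of-two modes."""
    round_up = next_power_of_two if power_of_two else pad_to_multiple
    return round_up(width), round_up(height)
```

For example, a 1021x510 image would be padded to 1024x512 in both modes, while a 300x200 image stays 300x200 by default and becomes 512x256 in power-of-two mode.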
Finally, gltfpack is now available as a JS library in addition to having command-line executables; the library uses a filesystem-like interface. Please refer to gltf/library.js for documentation on the two exposed functions.
gltfpack improvements
- Improve support for scenes with many instances of the same mesh; -mm is now required to merge these instances together
- Implement support for EXT_mesh_gpu_instancing via -mi command line option
- -km can now be used to keep unused materials
- -ke now keeps extras on nodes in addition to materials
- Improve memory consumption when packing large scenes by 40% on average
- node.js version of gltfpack now supports texture compression if basisu or toktx are available
- Update KTX2 support to track latest KTX2 specification, including DFD changes for ETC1S/UASTC
- Implement support for various PBR.Next extensions including KHR_materials_transmission, KHR_materials_ior, KHR_materials_specular and KHR_materials_sheen
- Implement support for toktx when compressing textures
- Implement support for -ts that can be used to rescale textures to reduce transmission and memory size
- Instead of using 1-255 range for texture quality, -tq now accepts a level from 1 to 10, which is tuned to balance compression ratio vs quality for both ETC1S and UASTC
- Fix processing for files with unused texture coordinate 0
- Implement support for -tp that can be used to rescale images to power-of-two when using texture compression
- Remove command line option -tb in favor of -tc; the latter uses KHR_texture_basisu which should be more widely supported
- Remove command line option -te; textures are now automatically embedded into .glb files
- Implement JSON report via -r option which contains various stats about the resulting glTF scene
- Fix texture embedding for images with spaces in the URI
- Fix issues with non-uniform and negative mesh scale
- Implement support for multiple scenes; all scenes are now preserved along with their own node hierarchy
- Implement support for higher bitrate colors via -vc option
- Fix animation range in some cases, in particular starting time is now preserved when it's not 0, and ending time is preserved when animation doesn't have motion
Miscellaneous improvements
- Improve meshopt_simplify edge analysis to track edge loops more carefully; this fixes simplification for some cases where an open border would previously get collapsed incorrectly
- Fix a few issues with CMake configuration when meshoptimizer is used as a dependent library
- Fix compilation for old Apple Clang versions
- Reduce size of meshopt_decoder.js by 40% before gzip and 5% after gzip
- meshopt_decoder.js now has an ES6-friendly variant, meshopt_decoder.module.js, that can be imported.
This release features several new algorithms, mainly aimed at improving the geometry compression, as well as many gltfpack changes with the same goal.
New algorithms
- meshopt_optimizeVertexCacheStrip optimizes triangle lists for vertex cache, favoring long triangle strips over vertex transform efficiency. This function is recommended as a replacement for meshopt_optimizeVertexCache when reducing the compressed geometry size is more valuable than reducing vertex transform cost, or when using meshopt_stripify to produce shorter triangle strip sequences.
- meshopt_encodeIndexBuffer now supports the new strip-optimized order better; this required some bitstream changes that can be enabled with meshopt_encodeIndexVersion(1). Version 1 will become the default encoding version in a later release.
- meshopt_encodeIndexSequence can be used to compress index buffer data that doesn't represent triangle lists; the encoding is recommended for triangle strip or line lists, but can work with any index sequence (it's less efficient than meshopt_encodeIndexBuffer at compressing triangle lists)
When compressing geometry, using meshopt_optimizeVertexCacheStrip and meshopt_encodeIndexVersion(1) is recommended to minimize the distribution size of the resulting meshes; this can make the encoded data ~10% smaller before gzip/zstd compression and up to 20% smaller after gzip/zstd.
Additionally, a set of vertex filters (meshopt_decodeFilterOct, meshopt_decodeFilterQuat, meshopt_decodeFilterExp) was added to support MESHOPT_compression glTF extension; these are not as useful outside of glTF, and are described in detail in the extension draft. Cumulatively these can substantially reduce the geometry and animation data in glTF files compressed using the extension.
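As background for the octahedral filter, the following Python sketch shows the textbook octahedral mapping of a unit vector to two quantized components and back. It is a generic illustration only: the function names are hypothetical, and the exact bit layout and rounding used by meshopt_decodeFilterOct are defined in the extension draft and are not reproduced here.

```python
import math


def oct_encode(normal, bits=8):
    """Map a unit vector to two signed integers using octahedral projection."""
    x, y, z = normal
    s = abs(x) + abs(y) + abs(z)
    u, v = x / s, y / s
    if z < 0:
        # Fold the lower hemisphere over the diagonals of the unit square.
        u, v = ((1 - abs(v)) * math.copysign(1.0, u),
                (1 - abs(u)) * math.copysign(1.0, v))
    scale = (1 << (bits - 1)) - 1
    return round(u * scale), round(v * scale)


def oct_decode(encoded, bits=8):
    """Reconstruct an approximately unit-length vector from the two integers."""
    scale = (1 << (bits - 1)) - 1
    u, v = encoded[0] / scale, encoded[1] / scale
    z = 1 - abs(u) - abs(v)
    t = max(-z, 0)  # non-zero only for the folded (z < 0) hemisphere
    u += -t if u >= 0 else t
    v += -t if v >= 0 else t
    length = math.sqrt(u * u + v * v + z * z)
    return u / length, v / length, z / length
```

Storing two small integers instead of three floats is what makes this kind of filter attractive for normals and tangents; the decoder pays for it with a renormalization step.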
gltfpack highlights
gltfpack incorporates the new algorithms and filters to substantially improve the compression ratios for geometry and animation data. For example, Corset model from glTF-Sample-Models repository is 20% smaller, BrainStem model from the same repository is 30% smaller. Most of the changes currently require using a higher compression mode, activated via -cc command-line option; in a future release -cc may replace -c.
The texture compression support was updated to incorporate latest changes in KTX2 / KHR_texture_basisu specification; additionally, gltfpack now supports Basis UASTC encoding via -tu flag. Note that since gltfpack doesn't support UASTC RDO yet, the UASTC compressed files will be much larger (but much higher quality) compared to ETC1S encoded files.
For easier distribution, gltfpack is now available as an npm package.
gltfpack improvements
- Support all primitive topology modes, except indexed point lists, as an input
- Support for line lists as an output; line meshes were previously discarded
- Improve filtering of redundant geometry streams (removing color/morph delta streams as necessary)
- Implement support for KHR_materials_clearcoat extension
- Preserve extras data on material instances when -ke flag is used
- Add fine-grained control over quantization parameters for animations (-at, -ar, -as)
- Add -noq option that can be used to disable quantization (resulting in much larger files)
- Improve performance on large scenes with lots of mesh instances
- Improve validation and error messages for invalid input files
- Fix invalid output for files with meshes that don't produce any geometry
Miscellaneous improvements
- meshopt_decodeVertexBuffer now automatically enables SSSE3 SIMD implementation for clang/gcc using __cpuid-based runtime detection without the need to use extra compile flags
- meshopt_encodeVertexBuffer now works correctly on empty inputs (count = 0)
- CMake scripts now support CMake versions older than 3.7
- CMake options are now prefixed with MESHOPT_ (note: this breaks shared library builds, fixed in #129)
This release has several new algorithms, SIMD improvements for vertex codec and a lot of gltfpack changes including Basis support.
New algorithms
- meshopt_simplifyPoints can be used to simplify point clouds. The algorithm is a variant of the sloppy simplifier, which means it's fast and not attribute-aware (for now).
- meshopt_spatialSortRemap and meshopt_spatialSortTriangles can be used to reorder vertices or triangles to increase spatial locality. This is helpful when working with point clouds and triangle meshes with redundant connectivity, and can improve clusterization results.
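One common way to reorder points for spatial locality is to sort them along a Morton (Z-order) curve. The Python sketch below illustrates that general idea; it does not claim to match the method or output format of meshopt_spatialSortRemap, and the function names, the 10-bit quantization and the returned ordering (rather than a remap table) are assumptions of the sketch.

```python
def morton3(x, y, z, bits=10):
    """Interleave the low `bits` bits of three integers into one Morton code."""
    code = 0
    for b in range(bits):
        code |= ((x >> b) & 1) << (3 * b + 2)
        code |= ((y >> b) & 1) << (3 * b + 1)
        code |= ((z >> b) & 1) << (3 * b)
    return code


def spatial_sort_order(points, bits=10):
    """Return point indices sorted along a Morton curve over the bounding box."""
    if not points:
        return []
    mins = [min(p[a] for p in points) for a in range(3)]
    extent = max(max(p[a] for p in points) - mins[a] for a in range(3))
    scale = ((1 << bits) - 1) / extent if extent > 0 else 0.0

    def code(p):
        # Quantize each coordinate to an integer grid before interleaving.
        q = [int((p[a] - mins[a]) * scale + 0.5) for a in range(3)]
        return morton3(q[0], q[1], q[2], bits)

    return sorted(range(len(points)), key=lambda i: code(points[i]))
```

Points that are close in space tend to receive nearby codes, so consecutive entries in the sorted order are usually spatial neighbors.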
Performance improvements
- meshopt_decodeVertexBuffer now has an experimental AVX512 implementation, which is ~10% faster than SSSE3 implementation (it uses 128b vectors and as such carries no extra power cost). It requires AVX512-VBMI2 and AVX512-VL (available on Ice Lake CPUs).
- meshopt_decodeVertexBuffer now has an experimental WebAssembly SIMD implementation, which is ~3x faster than scalar implementation. It requires a compatible WebAssembly implementation with SIMD enabled (Chrome Canary was used for testing).
- WebAssembly decoders are now compiled using upstream Emscripten compiler backend, which results in ~5% faster decoding across the board.
Miscellaneous improvements
- All allocations now use allocation callbacks that can be set through meshopt_setAllocator; previously, allocations from meshopt_IndexAdapter were using global operator new/delete.
- CMake build system now supports BUILD_SHARED_LIBS
- CMake build system now can install gltfpack and libmeshoptimizer upon request
gltfpack highlights
This change includes a lot of work on extension specification. As a result, MESHOPT_quantized_geometry extension that was being used before got replaced with a new KHR_mesh_quantization extension (extension PR), and the details of MESHOPT_compression extension have changed substantially to allow for fallback data (extension PR), requiring updates to glTF loaders. Both three.js (r111) and Babylon.JS (4.1) can be used to load these files, with a custom demo/GLTFLoader.js for three.js and an extension demo/babylon.MESHOPT_compression.js for Babylon.JS.
As a result, gltfpack-produced files now validate cleanly with the most recent glTF validator build (2.0.0-dev.3.0 (November 2019)).
gltfpack also now supports Basis Universal texture supercompression. Encoding files with these textures requires basisu executable which can be built from the official repository. Two container format options are provided:
- .basis - native container format for Basis; this is supported by three.js and Babylon.JS today, but is likely to be removed in the future because this is not compatible with glTF specification.
- .ktx - KTX2 container format from Khronos that supports Basis supercompression; this is not supported by any renderer at the time of this writing, but this is the route that is being specified (spec PR).
In addition, there were a lot of changes aimed at increasing efficiency and extending feature support, with the full list below.
gltfpack improvements
- Switch from MESHOPT_quantized_geometry to KHR_mesh_quantization
- gltfpack-produced files now validate cleanly with the most recent build of glTF validator (PR)
- Update MESHOPT_compression specification, requires updating JSON loaders (GLTFLoader.js)
- Implement support for arbitrary number of input bone influences (largest 4 weights are preserved)
- Implement degenerate triangle filtering (5% triangle/size savings on some models)
- Use 8-bit morph target deltas when possible (depending on the model, up to 2x memory savings, ~3% size savings); requires three.js r111 to work correctly
- Add -cf command line option to support compressed data fallback; files produced with this option don't require MESHOPT_compression extension, but loaders that support it will not need to load uncompressed data
- Add -si R and -sa flags that simplify the meshes using default/aggressive (sloppy) simplification
- By default, gltfpack now produces normalized normals/tangents; this results in larger but specification-compliant files. This will be improved later; for now you can use -vu to get better compression by using unnormalized normals/tangents.
- Implement support for Basis / KTX2 compression (-tb to compress textures using basisu into .basis container; -tc to compress textures using KTX2 container which requires extra extensions and isn't supported by renderers yet)
- Implement support for embedding texture files into buffers (-te flag)
- Implement support for point clouds
- Improve animation compression efficiency for translation/scale data by reducing output precision slightly.
- Improve efficiency of bone influence encoding (~1% size savings)
- A few correctness fixes, including non-uniform scale handling and quantized color/weight data parsing
- Morph target names are now preserved using extras.targetNames JSON array
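The bone influence reduction mentioned in the list above (keeping the largest 4 weights) can be sketched as follows. The function name is hypothetical, and renormalizing the kept weights so they sum to 1 is an assumption of this sketch; the notes only state that the largest 4 weights are preserved.

```python
def keep_largest_influences(joints, weights, limit=4):
    """Keep the `limit` heaviest (joint, weight) pairs and renormalize them."""
    ranked = sorted(zip(weights, joints), key=lambda wj: wj[0], reverse=True)[:limit]
    total = sum(w for w, _ in ranked)
    if total <= 0:
        # Degenerate input: nothing meaningful to renormalize.
        return [(j, 0.0) for _, j in ranked]
    return [(j, w / total) for w, j in ranked]
```

With six input influences weighted 0.05, 0.4, 0.1, 0.25, 0.15 and 0.05, the sketch keeps joints 1, 3, 4 and 2 and rescales their weights by 1/0.9.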
This release contains a few improvements for various algorithms, introduces support for triangle strips with degenerate triangles and adds gltfpack (alpha).
Interface changes:
- meshopt_stripify and meshopt_unstripify now require an extra argument, restart_index
Improvements:
- Improve meshopt_simplifySloppy performance by up to 10% by using three-point interpolation search
- Improve results of meshopt_optimizeVertexCache by up to 0.5% by using a new data set obtained with differential evolution
- meshopt_stripify now supports stitching strips using degenerate triangles instead of restart indices; this typically results in a 10% larger index buffer compared to restart indices, but on some GPUs it can be substantially faster to render
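The difference between restart indices and degenerate-triangle stitching is easiest to see when expanding a strip back into a triangle list. The Python sketch below is a generic illustration with a hypothetical function name, not the library's meshopt_unstripify; it handles both conventions by resetting on the restart index and by skipping triangles that repeat a vertex.

```python
def unstripify(strip, restart_index=None):
    """Expand a triangle strip into a list of (a, b, c) triangles."""
    triangles = []
    start = 0  # position where the current sub-strip began
    for i, index in enumerate(strip):
        if restart_index is not None and index == restart_index:
            start = i + 1
            continue
        if i - start >= 2:
            a, b, c = strip[i - 2], strip[i - 1], index
            if (i - start) % 2 == 1:
                a, b = b, a  # odd triangles flip winding within a strip
            if a != b and b != c and a != c:
                triangles.append((a, b, c))  # degenerate triangles are dropped
    return triangles
```

With a restart index, two strips cost one extra index; with degenerate stitching they cost a few repeated indices instead, which is where the larger index buffer mentioned above comes from.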
gltfpack:
This release introduces an alpha version of gltfpack. gltfpack is a command-line tool that converts .obj or .gltf files to glTF files that are optimized for render performance and transmission time. gltfpack merges meshes and materials to reduce draw call count, merges buffers to reduce draw setup cost, quantizes vertex attributes to reduce GPU memory footprint, optimizes vertex and index data for more efficient GPU rendering, resamples and quantizes animation data to reduce memory footprint, and can optionally compress the vertex/index/animation buffers in the output using meshoptimizer codecs to further reduce the file size.
The resulting files rely on two not-yet-standardized extensions; when compression is not used, the resulting files can be loaded using three.js (r107+) and Babylon.js (4.1+) glTF loaders. Loading compressed files requires integrating JavaScript decoders (js/meshopt_decoder.js); demo/GLTFLoader.js contains a custom version of three.js loader that can be used to load them.
This release contains a few improvements for simplifier, introduces a new simplification algorithm, adds support for custom allocators and improves performance and code size of JavaScript decoders.
Interface changes:
- meshopt_computeMeshletBounds now passes meshlet parameter by pointer instead of by value.
New algorithms:
- Introduce a new simplification algorithm, meshopt_simplifySloppy, that performs decimation without concerns for topological integrity. The algorithm can and will merge small disjoint features together, and is extremely fast at ~20M triangles/sec on large meshes on modern desktop CPUs.
- Memory allocation can now be configured to use custom allocation callbacks using meshopt_setAllocator.
Improvements:
- Default simplifier now uses normalized error metric, which makes it much easier to consistently configure target_error parameter - it now corresponds to linear error, normalized to mesh radius (0.01 means 1% deviation).
- Fix edge cases when default simplifier could run many passes in vain, resulting in poor performance.
- Improve JavaScript decoder performance: vertex decoding is 17% faster, index decoding is 1.7x faster.
- Improve JavaScript decoder size: decoder.js is now 2.4x smaller (3.5 KB after gzip)
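To illustrate what a normalized error means in practice, this Python sketch converts a relative error into absolute mesh units by multiplying it with a mesh scale. The helper names are hypothetical, and using the largest bounding-box extent as the scale is an assumption of the sketch; the notes above describe the normalization only as relative to "mesh radius", so the library's exact measure may differ.

```python
def mesh_scale(positions):
    """Largest bounding-box extent over the three axes (assumed scale measure)."""
    if not positions:
        return 0.0
    return max(
        max(p[axis] for p in positions) - min(p[axis] for p in positions)
        for axis in range(3)
    )


def absolute_error(relative_error, positions):
    """Convert a relative (normalized) error into the units of the positions."""
    return relative_error * mesh_scale(positions)
```

For a mesh whose largest extent is 4 units, a relative error of 0.01 corresponds to an absolute deviation of 0.04 units under this assumption.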
Compatibility:
- Fix gcc -Wshadow warnings
- Work around a bug in Edge ChakraCore compiler that could result in indices being incorrectly decoded with decoder.js.
This release contains a number of fixes and improvements for vertex codec, substantially improves performance of several algorithms in Debug builds and introduces support for decompressing vertex/index data from JavaScript.
New algorithms:
- Introduce an experimental algorithm, meshopt_generateVertexRemapMulti, that generates the same remap table as meshopt_generateVertexRemap for indexing a mesh, but supports vertex data stored as multiple independent streams (deinterleaved)
- Introduce an experimental algorithm, meshopt_generateShadowIndexBufferMulti, that can generate a second index buffer that shares the vertex data with the original index buffer, but supports vertex data stored as multiple independent streams (deinterleaved)
Improvements:
- Optimize NEON code in meshopt_decodeVertexBuffer, making it 1-2% faster
- Improve compatibility of SIMD code in meshopt_decodeVertexBuffer, fixing compilation issues on ARM64, MSVC ARM, and clang for Windows
- Fix a bug in meshopt_encodeVertexBuffer that resulted in incorrectly encoded data on platforms where char is unsigned (this mostly affected ARM hosts such as Android)
- Substantially improve performance of multiple algorithms in Debug:
  - meshopt_analyzeVertexCache is 6x faster
  - meshopt_optimizeVertexCache is 4.7x faster
  - meshopt_analyzeOverdraw is 3.9x faster
  - meshopt_optimizeOverdraw is 1.4x faster
  - meshopt_simplify is 1.3x faster
JavaScript support:
- Introduce js/decoder.js that contains a WebAssembly version of vertex and index decoders with a JavaScript-friendly interface. The decoders run at 200-400 MB/s on modern desktop CPUs.
- Introduce tools/OptMeshLoader.js that contains an example mesh loader for THREE.js that uses vertex/index codecs for compression and quantizes vertex data for efficient storage; the meshes for this loader can be produced by tools/meshencoder.cpp using .OBJ files as an input.
This release substantially improves mesh simplification and introduces experimental algorithms for advanced GPU mesh rendering (cone culling, meshlet construction). The library can also now be used from Rust via https://crates.io/crates/meshopt.
Interface changes:
- meshopt_simplify has an extra argument, target_error, that can be used to limit the geometric error introduced by the simplifier
New algorithms:
- Introduce an experimental algorithm, meshopt_buildMeshlets, that can create meshlet data from an index buffer; the data can be used to efficiently drive the mesh shading pipeline in NVIDIA RTX GPUs
- Introduce experimental algorithms, meshopt_computeClusterBounds and meshopt_computeMeshletBounds, that can compute bounding sphere and bounding normal cone for use in GPU cluster culling.
- Introduce an experimental algorithm, meshopt_generateShadowIndexBuffer, that can generate a second index buffer that shares the vertex data with the original index buffer, but is more efficient when a subset of vertex attributes is needed.
Improvements:
- Significantly rework meshopt_simplify to improve simplification quality, including error metric improvements, attribute-guided collapse that preserves UV seam structure better, and other tweaks
- Significantly rework and optimize meshopt_simplify, making it ~4x faster
- Optimize meshopt_generateVertexRemap, making it 1.25x faster
- Optimize meshopt_decodeVertexBuffer for platforms without SIMD support, making it 1.1x faster
- Fix undefined behavior (left shift of negative integer) in meshopt_encodeVertexBuffer
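The idea of a shadow index buffer, introduced in this release, can be sketched in Python: every index is redirected to the first vertex that shares the attribute subset of interest (for example, position only). The function name and the key callback are hypothetical and only illustrate the concept, not the library's interface.

```python
def generate_shadow_index_buffer(indices, vertices, key):
    """Build an index buffer that reuses the first vertex with an equal key.

    `key` extracts the attribute subset that matters (e.g. position only).
    """
    first_with_key = {}
    shadow = []
    for index in indices:
        k = key(vertices[index])
        # setdefault keeps the first index seen for this key and returns it.
        shadow.append(first_with_key.setdefault(k, index))
    return shadow
```

Vertices that differ only in attributes outside the subset (such as UVs along a seam) collapse to one index, so position-only passes like depth or shadow rendering touch fewer unique vertices.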
This release introduces vertex buffer encoder and a stable version of index buffer encoder.
New algorithms:
- Introduce vertex encoder that compresses vertex buffers; it can be invoked using meshopt_encodeVertexBuffer and meshopt_decodeVertexBuffer. The algorithm typically provides 1.5-2x compression ratio for quantized vertex data, and the resulting data can be compressed further by a general purpose compressor like zstd. Decoding is highly optimized using SSSE3/NEON and runs at 2 GB/s on a modern desktop CPU.
- Introduce a stable index encoder that compresses index buffers; it can be invoked using meshopt_encodeIndexBuffer and meshopt_decodeIndexBuffer. The algorithm typically encodes index buffers using ~3-4 bits per index, and the resulting data can be compressed further by a general purpose compressor like zstd, yielding ~2-3 bits per index for most meshes. Decoding is highly optimized and runs at 2 GB/s on a modern desktop CPU for 32-bit indices (1 GB/s for 16-bit indices).
- Introduce a new algorithm to optimize for vertex fetch, meshopt_optimizeVertexFetchRemap; it generates a remap table that can be used with meshopt_remapVertexBuffer/meshopt_remapIndexBuffer and helps optimize meshes with several vertex streams.
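A remap table for vertex fetch optimization can be illustrated by numbering vertices in the order the index buffer first references them. The Python sketch below uses hypothetical function names and marks unused vertices with None, which is an assumption of the sketch; it only demonstrates why one remap table can be applied to several vertex streams.

```python
def optimize_vertex_fetch_remap(indices, vertex_count):
    """remap[old] = new position, assigned in order of first use; None if unused."""
    remap = [None] * vertex_count
    next_vertex = 0
    for index in indices:
        if remap[index] is None:
            remap[index] = next_vertex
            next_vertex += 1
    return remap


def remap_index_buffer(indices, remap):
    """Rewrite indices so they point at the reordered vertices."""
    return [remap[index] for index in indices]


def remap_vertex_buffer(vertices, remap):
    """Reorder one vertex stream; call once per stream with the same remap."""
    used = sum(1 for new in remap if new is not None)
    result = [None] * used
    for old, new in enumerate(remap):
        if new is not None:
            result[new] = vertices[old]
    return result
```

Because the table depends only on the index buffer, positions, normals and UVs stored in separate streams can each be reordered with the same remap and stay consistent.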
Improvements:
- Optimize cluster sorting in meshopt_optimizeOverdraw, making the function 10% faster
- Optimize index decoder, making it 15% faster for 32-bit indices and 40% faster for 16-bit indices
- Fix meshopt_analyzeVertexCache and meshopt_analyzeVertexFetch results for sparse vertex buffers (with unused vertices)
- Support in-place optimization in meshopt_remapVertexBuffer
- Improve CMake build files to make the library easier to integrate
This release has large interface changes and introduces several new algorithms and tweaks to existing algorithms.
Interface:
- All C++ function wrappers have been moved out of meshopt namespace and gained meshopt_ prefix to simplify documentation & interface
- All structs used by the interface have been renamed and now also have meshopt_ prefix to avoid name conflicts
- meshopt_quantizeX functions now use function arguments instead of template parameters for better compatibility
- cache_size argument has been removed from meshopt_optimizeVertexCache and meshopt_optimizeOverdraw; to perform optimization for a FIFO cache of a fixed size, use meshopt_optimizeVertexCacheFifo
New algorithms:
- Introduce an algorithm that compresses index buffers; it can be invoked using meshopt_encodeIndexBuffer and meshopt_decodeIndexBuffer. The algorithm typically encodes index buffers using ~3-4 bits per index, and the resulting data can be compressed further by a general purpose compressor like zstd, yielding ~2-3 bits per index for most meshes.
- Introduce an algorithm that can convert an index buffer to a triangle strip that is still reasonably cache efficient; indexed triangle strips are faster to render on some hardware and can reduce the index buffer size. The algorithm can be invoked using meshopt_stripify and typically produces buffers with around 60-65% indices compared to triangle lists, and a 5-10% ACMR penalty on GPUs with small caches.
- Introduce a new quantization function, meshopt_quantizeFloat, that can reduce the precision of a floating-point number while keeping the floating-point representation. This can be useful to generate vertex data that can be compressed more effectively using a general purpose compression algorithm.
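Reducing float precision while keeping the floating-point representation amounts to rounding away low mantissa bits of the 32-bit pattern. The Python sketch below shows the idea with a hypothetical quantize_float helper; its handling of zero, denormals, infinities and NaN is an assumption of the sketch and may differ from meshopt_quantizeFloat.

```python
import struct


def quantize_float(value, mantissa_bits):
    """Round a float32 value so only `mantissa_bits` (1..23) mantissa bits remain."""
    bits = struct.unpack("<I", struct.pack("<f", value))[0]
    drop = 23 - mantissa_bits      # number of low mantissa bits to discard
    mask = (1 << drop) - 1
    half = (1 << drop) >> 1        # add half a step to round to nearest

    exponent = bits & 0x7F800000
    if exponent == 0x7F800000:     # inf/nan: leave untouched
        return value
    if exponent == 0:              # zero/denormal: flush to zero (assumption)
        return 0.0

    rounded = (bits + half) & ~mask & 0xFFFFFFFF
    return struct.unpack("<f", struct.pack("<I", rounded))[0]
```

The cleared low bits become long runs of zeros in the byte stream, which is what lets a general purpose compressor do a better job on the quantized data.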
Improvements:
- Overdraw analyzer (meshopt_analyzeOverdraw) now uses a pixel center fill convention to match hardware rendering more closely.
- Vertex cache analyzer (meshopt_analyzeVertexCache) now models cache that matches real hardware a bit more closely, and requires additional parameters to configure (namely, primitive group size and warp/wavefront size).
- Vertex cache optimizer (meshopt_optimizeVertexCache) has been tuned to generate better output that performs well on real hardware, especially given meshes that have topology similar to that of a uniform grid as an input.
- Various algorithms have been optimized for performance and memory consumption.
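For readers unfamiliar with ACMR (average cache miss ratio, i.e. vertex cache misses per triangle), the following Python sketch simulates a plain fixed-size FIFO vertex cache. It is a deliberately simple model with a hypothetical function name; the analyzer described above models real hardware more closely (primitive groups, warp/wavefront size), which this sketch does not attempt.

```python
from collections import deque


def fifo_cache_acmr(indices, cache_size):
    """Average cache miss ratio (misses per triangle) for a FIFO vertex cache."""
    cache = deque(maxlen=cache_size)  # the oldest entry is evicted automatically
    misses = 0
    for index in indices:
        if index not in cache:
            misses += 1
            cache.append(index)
    triangles = len(indices) // 3
    return misses / triangles if triangles else 0.0
```

An ACMR of 3.0 means no reuse at all, while well-optimized meshes approach 0.5 to 0.7 with a reasonably sized cache; comparing the value before and after reordering gives a quick sanity check of an optimization pass.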