-
Updated
Jul 23, 2021 - C++
#
avx
Here are 166 public repositories matching this topic...
C++ image processing and machine learning library using SIMD: SSE, AVX, AVX-512 for x86/x64, VMX (AltiVec) and VSX (Power7) for PowerPC, NEON for ARM.
c-plus-plus
machine-learning
arm
neural-network
neon
image-processing
avx
sse
simd
avx2
sse2
sse41
avx512
powerpc
altivec
vsx
ssse3
simd-library
haar-cascade
lbp
C++ wrappers for SIMD intrinsics and parallelized, optimized mathematical functions (SSE, AVX, NEON, AVX512)
cpp
neon
c-plus-plus-11
avx
sse
simd
vectorization
avx512
mathematical-functions
simd-instructions
simd-intrinsics
-
Updated
Jul 24, 2021 - C++
Fast, modern C++ DSP framework, FFT, Sample Rate Conversion, FIR/IIR/Biquad Filters (SSE, AVX, AVX-512, ARM NEON)
audio
cplusplus
dft
cxx
travis-ci
dsp
cpp14
intel
avx
clang
simd
header-only
fast-fourier-transform
cpp17
cplusplus-14
fft
digital-signal-processing
avx512
ser
audio-processing
cplusplus-17
discrete-fourier-transform
-
Updated
May 14, 2021 - C++
SIMD Vector Classes for C++
c-plus-plus
cpp
portable
neon
cpp14
parallel
parallel-computing
avx
sse
cpp11
simd
cpp17
avx2
simd-programming
vectorization
avx512
simd-instructions
simd-vector
data-parallel
-
Updated
Jul 8, 2021 - C++
c
euler
opengl
math
postfix
neon
vector
matrix
bezier
avx
sse
simd
affine-transform-matrices
opengl-math
3d
bounding-boxes
matrix-decompositions
frustum
3d-math
marix-inverse
glm-for-c
-
Updated
Jun 15, 2021 - C
Performance-optimized wheels for TensorFlow (SSE, AVX, FMA, XLA, MPI)
-
Updated
Jul 15, 2019
Accelerate SHA256 computations in pure Go using AVX512, SHA Extensions for x86 and ARM64 for ARM. On AVX512 it provides an up to 8x improvement (over 3 GB/s per core). SHA Extensions give a performance boost of close to 4x over native.
-
Updated
Jun 17, 2021 - Go
SIMD Library for Evaluating Elementary Functions, vectorized libm and DFT
android
ios
arm
neon
cuda
avx
simd
elementary-functions
sse2
fft
vectorization
math-library
aarch64
avx512
powerpc
vsx
vector-math
s390x
quadruple-precision
sve
-
Updated
Jul 9, 2021 - C
BitMagic Library
c
c-plus-plus
information-retrieval
cmake
algorithm
avx
bit-manipulation
simd
integer-compression
sparse-vectors
sparse-matrix
bit-array
indexing-engine
bit-vector
adjacency-matrix
associative-array
sparse-vector
-
Updated
Jul 23, 2021 - C++
Math library using hlsl syntax with SSE/NEON support
math
cpp
shaders
neon
c-plus-plus-11
vector
matrix
modern-cpp
game-development
avx
sse
quaternion
variants
hlsl
sse41
math-library
ser
-
Updated
May 3, 2021 - C++
Examples of C# code compiled to GPU by hybridizer
visual-studio
compiler
dotnet
gpu
optimization
parallel
cuda
avx
avx2
vectorization
avx512
hybridizer-essentials
-
Updated
Sep 5, 2019 - C#
Fast inference engine for Transformer models
deep-neural-networks
cpp
neon
openmp
parallel-computing
cuda
avx
intrinsics
avx2
neural-machine-translation
opennmt
quantization
gemm
mkl
thrust
transformer-models
onednn
-
Updated
Jul 22, 2021 - C++
Open Source Architecture Code Analyzer
python
hpc
latency
assembly
avx
x86
throughput
avx2
performance-analysis
avx512
out-of-order
critical-path
port-mapping
performance-modeling
arm64v8
sve
in-core
loop-carried-dependency
-
Updated
Jul 21, 2021 - Jupyter Notebook
Agenium Scale vectorization library for CPUs and GPUs
hpc
neon
cuda
avx
simd
avx2
sse2
simd-programming
aarch64
avx512
simd-instructions
simd-library
sse42
rocm
cpp20
sve
neon128
cpp20-library
vectorization-library
-
Updated
Jul 21, 2021 - Python
python
c
openmp
avx
simd
cosmology
astrophysics
galaxies
large-scale-structure
pair-counting
intrinsics
avx2
avx512
sse42
correlation-functions
-
Updated
Jul 22, 2021 - C
Expressive Velocity Engine - SIMD in C++ Goes Brrrr
cpp
hpc
neon
avx
simd
avx2
sse2
mit-license
simd-programming
cpp-library
aarch64
simd-parallelism
altivec
ssse3
simd-library
sse3
cpp20
sse4
cpp20-library
-
Updated
Jul 24, 2021 - C++
Turbo Base64 - Fastest Base64 SIMD/Neon/Altivec
encoding
benchmark
arm
library
base64
neon
avx
sse
simd
avx2
base64-encoding
base64-decoding
encoding-library
-
Updated
Aug 17, 2020 - C
An AVX Lifter for the Hex-Rays Decompiler
-
Updated
Jul 22, 2020 - Python
ihhub
commented
Jan 22, 2020
We have a bitmap image saving function void Save( const std::string & path, const penguinV::Image & image, uint32_t startX, uint32_t startY, uint32_t width, uint32_t height ),
which is located in the src/file/bmp_image.h and src/file/bmp_image.cpp files.
During file saving we purposely copy a line of the image to a temporary array and then write the array into the file. The reason behind this is that
UME::SIMD: a library for explicit SIMD vectorization.
benchmark
cpp
neon
vector
cpp14
avx
cpp11
simd
performance-tuning
cpp17
code-generation
avx2
simd-programming
vectorization
avx512
simd-instructions
altivec
instruction-set-architecture
scalar-types
ume
-
Updated
Jan 19, 2018 - C++
The WebAssembly people are working on a relaxed SIMD proposal which mostly just provides alternatives for already-implemented functions, but allows for some differences between different implementations (e.g., allowing different results for out-of-range values, NaNs, etc.).
This should be a pretty easy issue to resolve; we can mostly just copy the
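To illustrate the kind of implementation difference mentioned above, here is a minimal scalar sketch of two float-to-int32 truncation behaviours for out-of-range values and NaNs. The function names trunc_sat and trunc_x86_style are hypothetical and are not taken from any library. The first models the saturating per-lane behaviour of WebAssembly's existing i32x4.trunc_sat_f32x4_s. The second models what x86's CVTTPS2DQ returns. A relaxed variant would presumably be allowed to produce either result.

```cpp
#include <cmath>
#include <cstdint>
#include <limits>

// Hypothetical helper: saturating truncation, modelling one lane of
// WebAssembly's i32x4.trunc_sat_f32x4_s (NaN -> 0, out of range -> clamp).
std::int32_t trunc_sat(float x) {
    if (std::isnan(x)) return 0;
    if (x >= 2147483648.0f) return std::numeric_limits<std::int32_t>::max();
    if (x <= -2147483648.0f) return std::numeric_limits<std::int32_t>::min();
    return static_cast<std::int32_t>(x);  // in range: truncate toward zero
}

// Hypothetical helper: x86 CVTTPS2DQ-style truncation, where NaN and
// out-of-range inputs all yield the "integer indefinite" value 0x80000000.
std::int32_t trunc_x86_style(float x) {
    if (std::isnan(x) || x >= 2147483648.0f || x < -2147483648.0f)
        return std::numeric_limits<std::int32_t>::min();
    return static_cast<std::int32_t>(x);  // in range: truncate toward zero
}
```

The two helpers agree on every in-range input and differ only for NaN and for positive values too large for int32, which is exactly the class of inputs a relaxed operation leaves to the implementation.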