Releases
v6.1.145
Algorithms
New features
Parameter add in function SimdSynetMergedConvolution16bInit.
Base implementation, SSE4.1, AVX2, AVX-512BW optimizations of function SynetTiledScale2D32f.
AMX-BF16 kernel DepthwiseConvolution_k5p2d1s1w6 for class SynetMergedConvolution16b.
AMX-BF16 kernel DepthwiseConvolution_k5p2d1s1w4 for class SynetMergedConvolution16b.
AMX-BF16 kernel DepthwiseConvolution_k3p1d1s1w8 for class SynetMergedConvolution16b.
AMX-BF16 kernel DepthwiseConvolution_k3p1d1s1w6 for class SynetMergedConvolution16b.
Base implementation, SSE4.1 optimizations of class ResizerBf16Bilinear.
Improving
Extended use of AMX-BF16 optimization of function DepthwiseConvolution_k7p3d1s1w4.
Extended use of AMX-BF16 optimization of function DepthwiseConvolution_k7p3d1s1w6.
Extended use of AMX-BF16 optimization of function DepthwiseConvolution_k7p3d1s1w8.
Extended use of AVX-512BW optimization of function Convolution32fNhwcDepthwise_k7p3d1s1w4.
Extended use of AMX-BF16 optimization of function DepthwiseConvolution_k5p2d1s1w8.
Performance of SynetConvolution32f (NHWC, srcC=1, dstC=1).
Bug fixing
Error in AMX-BF16 optimizations of class SynetInnerProduct16bGemmNN.
Error in AVX-512BW optimizations of class SynetAdd16bUniform.
Error in AMX-BF16 optimizations of function DepthwiseConvolutionDefault.
Error in AMX-BF16 optimizations of function DepthwiseConvolutionLargePad.
Error in Base implementation of class SynetMergedConvolution16bCdc.
Error in Base implementation of class SynetMergedConvolution16bCd.
Error in class InputMemoryStream.
Removing
Parameter compatibility in function SimdSynetMergedConvolution16bInit.
Parameter internal in function SimdSynetMergedConvolution16bSetParams.
Test framework
New features
Tests for verifying functionality of function SynetTiledScale2D32f.