Simd v61.145

@ermig1979 released this 01 Jan 21:01

Algorithms

New features
  • Parameter "add" in function SimdSynetMergedConvolution16bInit.
  • Base implementation, SSE4.1, AVX2, AVX-512BW optimizations of function SynetTiledScale2D32f.
  • AMX-BF16 kernel DepthwiseConvolution_k5p2d1s1w6 for class SynetMergedConvolution16b.
  • AMX-BF16 kernel DepthwiseConvolution_k5p2d1s1w4 for class SynetMergedConvolution16b.
  • AMX-BF16 kernel DepthwiseConvolution_k3p1d1s1w8 for class SynetMergedConvolution16b.
  • AMX-BF16 kernel DepthwiseConvolution_k3p1d1s1w6 for class SynetMergedConvolution16b.
  • Base implementation, SSE4.1 optimizations of class ResizerBf16Bilinear.
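
The depthwise kernel names above appear to encode the convolution geometry. As a rough illustration, the sketch below reads k3p1d1s1 as kernel 3, padding 1, dilation 1, stride 1; that reading is an assumption not confirmed by these notes, and the trailing w value is left uninterpreted. The function DepthwiseConvolutionRef is hypothetical, plain scalar FP32 code, and is not part of the Simd API or its AMX-BF16 implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical scalar reference, NOT the Simd implementation.
// Depthwise convolution over one NHWC image: every channel is filtered
// independently with its own K x K kernel (weights laid out as [K][K][channels]).
// K = kernel size, P = zero padding, D = dilation, S = stride (assumed meaning
// of the k/p/d/s fields in the kernel names).
template <int K, int P, int D, int S>
std::vector<float> DepthwiseConvolutionRef(const std::vector<float>& src,
    size_t srcH, size_t srcW, size_t channels,
    const std::vector<float>& weight, size_t& dstH, size_t& dstW)
{
    assert(src.size() == srcH * srcW * channels);
    assert(weight.size() == size_t(K) * K * channels);
    // Standard output-size formula; assumes the padded input covers the kernel.
    dstH = (srcH + 2 * P - D * (K - 1) - 1) / S + 1;
    dstW = (srcW + 2 * P - D * (K - 1) - 1) / S + 1;
    std::vector<float> dst(dstH * dstW * channels, 0.0f);
    for (size_t dy = 0; dy < dstH; ++dy)
    {
        for (size_t dx = 0; dx < dstW; ++dx)
        {
            for (size_t c = 0; c < channels; ++c)
            {
                float sum = 0.0f;
                for (int ky = 0; ky < K; ++ky)
                {
                    std::ptrdiff_t sy = std::ptrdiff_t(dy) * S + ky * D - P;
                    if (sy < 0 || sy >= std::ptrdiff_t(srcH))
                        continue; // row falls into the zero padding
                    for (int kx = 0; kx < K; ++kx)
                    {
                        std::ptrdiff_t sx = std::ptrdiff_t(dx) * S + kx * D - P;
                        if (sx < 0 || sx >= std::ptrdiff_t(srcW))
                            continue; // column falls into the zero padding
                        sum += src[(size_t(sy) * srcW + size_t(sx)) * channels + c] *
                               weight[(size_t(ky) * K + kx) * channels + c];
                    }
                }
                dst[(dy * dstW + dx) * channels + c] = sum;
            }
        }
    }
    return dst;
}
```

Under this reading, DepthwiseConvolutionRef<3, 1, 1, 1> would correspond to the k3p1d1s1 geometry and DepthwiseConvolutionRef<5, 2, 1, 1> to k5p2d1s1; both keep the output the same size as the input.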
Improving
  • Extended use of AMX-BF16 optimization of function DepthwiseConvolution_k7p3d1s1w4.
  • Extended use of AMX-BF16 optimization of function DepthwiseConvolution_k7p3d1s1w6.
  • Extended use of AMX-BF16 optimization of function DepthwiseConvolution_k7p3d1s1w8.
  • Extended use of AVX-512BW optimization of function Convolution32fNhwcDepthwise_k7p3d1s1w4.
  • Extended use of AMX-BF16 optimization of function DepthwiseConvolution_k5p2d1s1w8.
  • Performance of SynetConvolution32f (NHWC, srcC=1, dstC=1).
Bug fixing
  • Error in AMX-BF16 optimizations of class SynetInnerProduct16bGemmNN.
  • Error in AVX-512BW optimizations of class SynetAdd16bUniform.
  • Error in AMX-BF16 optimizations of function DepthwiseConvolutionDefault.
  • Error in AMX-BF16 optimizations of function DepthwiseConvolutionLargePad.
  • Error in Base implementation of class SynetMergedConvolution16bCdc.
  • Error in Base implementation of class SynetMergedConvolution16bCd.
  • Error in class InputMemoryStream.
Removing
  • Parameter "compatibility" in function SimdSynetMergedConvolution16bInit.
  • Parameter "internal" in function SimdSynetMergedConvolution16bSetParams.

Test framework

New features
  • Tests for verifying functionality of function SynetTiledScale2D32f.