v0.18.0

@oleksandr-pavlyk oleksandr-pavlyk released this 30 Sep 10:42
· 426 commits to master since this release
786365e

This release reaches an important milestone: offloading is now fully asynchronous.

Calls to dpctl.tensor functions submit tasks to the DPC++ runtime for execution and return without waiting for those tasks to finish.
The sequential semantics a user expects from running a Python script are nevertheless preserved.
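
A minimal sketch of what preserved sequential semantics means in practice: each operation below may return before its device work has finished, yet copying the result to the host still reflects every earlier step in program order. The NumPy fallback and the helper name `to_host` are assumptions added only so the snippet runs where dpctl or a SYCL device is unavailable; the explicit `queue.wait()` is illustrative and is not needed for correctness.

```python
# Prefer dpctl.tensor; fall back to NumPy (an assumption made only so the
# sketch still runs on machines without dpctl or without a SYCL device).
try:
    import dpctl
    import dpctl.tensor as xp

    dpctl.select_default_device()  # raises if no SYCL device is available
    to_host = xp.asnumpy           # copies device data into a NumPy array
except Exception:
    import numpy as xp

    to_host = xp.asarray           # hypothetical stand-in for the host copy

x = xp.arange(5)   # with dpctl, work is submitted to the device queue
y = x * 2          # may return before the multiplication has finished
y += 1             # the library orders this after the multiplication

# Optional with dpctl: block until everything submitted to the queue is done.
queue = getattr(y, "sycl_queue", None)
if queue is not None:
    queue.wait()

# Copying to the host observes the results of all preceding operations.
result = to_host(y).tolist()
print(result)
```

The script reads like ordinary sequential Python; no manual event handling is required to get a consistent result.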

The full list of changes that went into this release is:

Added

  • Implement tensor.take_along_axis per Python Array API specification gh-1778
  • Implement tensor.put_along_axis to complement tensor.take_along_axis gh-1798
  • Support for 'device=tensor.kDLCPU' in tensor.from_dlpack function and tensor.usm_ndarray.__dlpack__ method gh-1781
  • Support DLPack on Windows gh-1746
  • Implement tensor.nextafter function per Python Array API specification gh-1730
  • Implement tensor.count_nonzero and tensor.diff functions from Python Array API specification gh-1732, gh-1780
  • Add support for order="K" to *_like array creation functions, and change default order keyword value from 'C' to 'K' gh-1808
  • Support for 'max dimensions' in Array API capabilities info data gh-1774
  • Add support for device aspect 'emulated' gh-1691
  • dpctl::tensor::usm_memory class defined in dpctl4pybind11.hpp adds constructor to create Python USM memory objects viewing into existing USM allocations, which can be made by an external library gh-1782
  • Add support for COVERAGE build type in project's CMake script gh-1692
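
As an illustration of three of the functions added above, the following sketch sorts each row with argsort plus take_along_axis, counts non-zero entries per row, and takes first differences along a row. It is a hedged example, not taken from the release itself: the NumPy fallback and the `to_host` helper are assumptions so the snippet runs without dpctl or a SYCL device, and the sample data is made up.

```python
# Prefer dpctl.tensor; fall back to NumPy, which offers functions of the
# same names (the fallback is an assumption made for portability only).
try:
    import dpctl
    import dpctl.tensor as xp

    dpctl.select_default_device()  # raises if no SYCL device is available
    to_host = xp.asnumpy
except Exception:
    import numpy as xp

    to_host = xp.asarray           # hypothetical stand-in for the host copy

x = xp.asarray([[3, 0, 2], [0, 5, 1]])

# take_along_axis gathers values using per-row indices, here from argsort,
# which yields each row in ascending order.
order = xp.argsort(x, axis=1)
sorted_rows = to_host(xp.take_along_axis(x, order, axis=1)).tolist()

# count_nonzero counts the non-zero entries along the chosen axis.
nonzero_per_row = to_host(xp.count_nonzero(x, axis=1)).tolist()

# diff computes first differences between neighbours along the axis.
row_diffs = to_host(xp.diff(x, axis=1)).tolist()

print(sorted_rows, nonzero_per_row, row_diffs)
```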

Changed

  • Change ownership of USM allocation by dpctl.memory objects, make executions of dpctl.tensor operations asynchronous gh-1705
  • Add support for Python scalars by tensor.where function gh-1719
  • Optimize division by Python scalar in statistical functions tensor.mean, tensor.std, tensor.var gh-1820
  • Use transcendental functions from sycl namespace instead of std namespace gh-1707
  • Changes for compatibility with recent NumPy in runtime environment gh-1735, gh-1772, gh-1804
  • Array creation function tensor.zeros to use asynchronous memset operation gh-1806
  • The setter of tensor.usm_ndarray.shape property now supports Python scalar value gh-1786
  • Use 'pyproject.toml' instead of 'setup.py' aligning with current packaging best practices gh-1660
  • No longer set SOVERSION property in DPCTLSyclInterface library on Linux gh-1773
  • Update version of 'pybind11' used gh-1758, gh-1812
  • Handle possible exceptions by usm_host_allocator used with std::vector gh-1791
  • Use dpctl::tensor::offset_utils::sycl_free_noexcept instead of sycl::free in host_task tasks associated with life-time management of temporary USM allocations gh-1797
  • Add "same_kind"-style casting for in-place mathematical operators of tensor.usm_ndarray gh-1827, gh-1830
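
Two of the behaviours mentioned in this release can be sketched briefly: with order="K" as the default, a *_like function keeps the memory layout of its input, and tensor.where accepts a Python scalar as a branch value. This is a hedged illustration; the NumPy fallback and the `to_host` helper are assumptions so the snippet runs without dpctl or a SYCL device, and NumPy happens to behave the same way for both cases.

```python
# Prefer dpctl.tensor; fall back to NumPy (an assumption for portability).
try:
    import dpctl
    import dpctl.tensor as xp

    dpctl.select_default_device()  # raises if no SYCL device is available
    to_host = xp.asnumpy
except Exception:
    import numpy as xp

    to_host = xp.asarray           # hypothetical stand-in for the host copy

# order="K" (now the default) keeps the layout of the input array, so a
# column-major input yields a column-major result.
f_ordered = xp.zeros((2, 3), order="F")
like = xp.zeros_like(f_ordered)
keeps_f_layout = bool(like.flags.f_contiguous)

# where() with a Python scalar: negative entries are replaced by 0.
v = xp.asarray([-1, 2, -3, 4])
clipped = to_host(xp.where(v > 0, v, 0)).tolist()

print(keeps_f_layout, clipped)
```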

Fixed

  • Fix setting of release variable in Sphinx config file gh-1685
  • Handle possible NULL return value from device aspect queries DPCTLDevice_GetMaxWorkGroupSize1d and DPCTLDevice_GetMaxWorkGroupSize2d gh-1690
  • Add license header to conda script files gh-1695
  • Fix tensor.round behavior on CUDA devices gh-1700
  • Add missing #include <sstream> gh-1701
  • Fix for issue 1724 gh-1728
  • Correct USM type for return array of tensor.extract function gh-1727
  • Fix for tensor.unique_all and tensor.unique_inverse to always return index arrays with default indexing data type gh-1741
  • Propagate read-only flag from __sycl_usm_array_interface__ in tensor.asarray function gh-1756
  • tensor.clip to handle Python scalars which are out of bound for the data type of integral array gh-1759
  • Avoid dead-locking by releasing GIL around blocking operations in libtensor gh-1753
  • Element-wise tensor.divide and comparison operations allow greater range of Python integer and integer array combinations gh-1771
  • Fix for unexpected behavior when using floating point types for array indexing gh-1792
  • Enable pytest --pyargs dpctl.tests gh-1833

Maintenance

  • Improve performance of test_sort_complex_fp_nan gh-1704
  • Improve exception wording raised by tensor.broadcast_arrays() gh-1720
  • Remove template keyword in method call of sycl::kernel_bundle gh-1726
  • Backport changelog edits from maintenance/0.17.x gh-1736
  • Replace uses of 'intel' channels in docs and readme file gh-1737
  • Update references to deprecated environment variable SYCL_DEVICE_FILTER gh-1740
  • Correction for installation instruction steps gh-1754
  • Fix for crash during testing with open source SYCL bundle by updating CPU RT library used gh-1762
  • Add missing include to fix build break with newer LLVM gh-1776
  • Add #include <utility> for definition of std::move used gh-1787
  • Change to CMake script to accommodate DPC++ transition from PI to UR architecture gh-1788
  • Document tensor._flags.Flags class gh-1794
  • Fix for unreferenced unreleased bug in copy-and-cast code logic gh-1799
  • Explicitly include headers used in C++ translation units implementing reduction operations gh-1802
  • Clean-up uses of Strided1DIndexer class gh-1805
  • Tweak to readability of C++ code implementing matrix-matrix multiplication gh-1810
  • Do not add sycl::event associated with compute task to vector of events representing execution of host_task gh-1807
  • Remove 'level-zero' conda package from run-time dependencies of 'dpctl' since Intel GPU driver stack now explicitly depends on libze1 package which provides Level-Zero loader library gh-1801, gh-1840
  • Use dedicated type-support matrices for in-place element-wise binary operations gh-1816
  • Remove recommendation to install wheels from Anaconda PyPI index gh-1819
  • Removed use of post-link and pre-unlink conda scripts in dpctl gh-1821
  • Pin compiler used to build 0.18.0 version to 2025.0.0 gh-1822
  • A variety of changes to continuous integration/delivery (CI/CD) supporting scripts to keep CI running smoothly:
    gh-1686, gh-1688, gh-1697, gh-1698, gh-1703, gh-1702, gh-1709, gh-1712, gh-1713, gh-1722, gh-1725, gh-1729, gh-1733, gh-1721, gh-1743, gh-1739, gh-1747, gh-1748, gh-1750, gh-1752, gh-1767, gh-1768, gh-1775, gh-1783, gh-1790, gh-1795, gh-1796, gh-1800, gh-1760, gh-1803, gh-1777, gh-1813, gh-1817, gh-1818