19 Jul 12:32

oleksandr-pavlyk

3794cbc

v0.14.4

This is a hot-fix release for 0.14.3.

Added

Added dpctl.tensor.less_equal, dpctl.tensor.greater, dpctl.tensor.greater_equal: #1239

Changed

Optimized in-place arithmetic operations for updating matrix with rows/columns via broadcasting: #1244
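
The update pattern this change targets can be pictured with a small pure-Python sketch. It uses nested lists rather than dpctl arrays, and the helper name `iadd_row` is hypothetical; it only illustrates what updating a matrix with a row via broadcasting means, not how dpctl implements or optimizes it.

```python
def iadd_row(matrix, row):
    """Add `row` to every row of `matrix` in place (hypothetical helper)."""
    for r in matrix:
        if len(r) != len(row):
            raise ValueError("row length must match the matrix width")
        for j, value in enumerate(row):
            r[j] += value  # the existing storage is modified, nothing is copied
    return matrix


m = [[1, 2, 3], [4, 5, 6]]
iadd_row(m, [10, 20, 30])  # the row is reused for every matrix row
```

With array objects the same update is usually written as `m += row`, leaving the broadcasting to the library.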

Fixed

Fixed handling of 0d arrays in dpctl.tensor.sum: #1238


19 Jul 12:31

oleksandr-pavlyk

0.14.3

81553f8

v0.14.3

Added

Added support of axis=None in dpctl.tensor.concat #1125
Added caching for dpctl.SyclDevice.filter_string property #1127
Added dpctl.tensor.isdtype from array API #1133
Added dpctl.tensor.unstack, dpctl.tensor.moveaxis, dpctl.tensor.swapaxes #1137, #1174
Allow for mutation of dpctl.tensor.usm_ndarray.flags.writable #1141
Added dpctl.tensor.where from array API #1147
Include libtensor headers in dpctl installation layout #1185
Added new properties of dpctl.tensor.usm_ndarray object #1199
Added a list of unary and binary elementwise functions from array API:
- #1203: dpctl.tensor.add, dpctl.tensor.divide, dpctl.tensor.isnan, dpctl.tensor.isinf, dpctl.tensor.isfinite, dpctl.tensor.cos, dpctl.tensor.abs, dpctl.tensor.equal
- #1205: dpctl.tensor.sqrt
- #1209: implements out keyword argument
- #1211: dpctl.tensor.multiply, dpctl.tensor.subtract
- #1214: dpctl.tensor.not_equal
- #1216: dpctl.tensor.exp, dpctl.tensor.sin
- #1217: dpctl.tensor.real, dpctl.tensor.imag, dpctl.tensor.proj
- #1218: dpctl.tensor.log, dpctl.tensor.log1p, dpctl.tensor.expm1
- #1221: dpctl.tensor.floor_divide
- #1235: dpctl.tensor.less
- #1237: in-place support for addition, multiplication and subtraction
Added dpctl.tensor.all and dpctl.tensor.any #1204
Added dpctl.tensor.sum #1210
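
In the array API standard, `axis=None` in `concat` means the inputs are flattened before being joined. The following pure-Python sketch shows that behavior on nested lists; the function names are hypothetical and the code is an illustration, not dpctl's implementation.

```python
def flatten(nested):
    """Return the elements of an arbitrarily nested list in row-major order."""
    if not isinstance(nested, list):
        return [nested]
    flat = []
    for item in nested:
        flat.extend(flatten(item))
    return flat


def concat_axis_none(arrays):
    """Flatten every input, then join the results end to end."""
    out = []
    for a in arrays:
        out.extend(flatten(a))
    return out


# A 2x2 input and a 1-D input of different shapes can still be joined.
result = concat_axis_none([[[1, 2], [3, 4]], [5, 6]])
```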

Changed

Updated examples of native Python extensions built using dpctl #1108
Used security flags to compile and link native extensions of dpctl #1109
Changed types of dpctl.tensor.finfo and dpctl.tensor.iinfo output structure per array API spec #1110
Consolidated multiple USM temporaries life-time management host_tasks to improve test suite stability #1111
MAINT: Improved cmake target dependency tracking #1112
MAINT: Improved docstrings for existing dpctl.tensor functions #1123
Changed default value of mode keyword in dpctl.tensor.take and dpctl.tensor.put from clip to wrap #1132
Added support for (nested) sequence of dpctl.tensor.usm_ndarray objects in dpctl.tensor.asarray #1139
Improved exception handling in dpctl.tensor.usm_ndarray.__setitem__ special method #1146
Simplified implementation of copy-and-cast kernels and removed special casing for 2D arrays to conserve binary size #1165
Improved speed of dpctl.tensor.usm_ndarray printing functionality #1187
Require DPC++ RT 2023.1 to build and run dpctl #1195
Compile offloading native extensions with -fno-sycl-id-queries-fit-in-int fixing gh-1184, #1200
Transition to conda-forge ecosystem #1213
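
To make the change of the default `mode` in `take` concrete, here is a pure-Python sketch of the two policies for out-of-range indices. It assumes the NumPy meaning of the mode names (`clip` clamps to the valid range, `wrap` wraps around modulo the length); dpctl's exact handling may differ, and the helper names are hypothetical.

```python
def resolve_index(i, n, mode):
    """Map index `i` into range(n) under an assumed NumPy-style policy."""
    if n <= 0:
        raise ValueError("cannot index into an empty sequence")
    if mode == "clip":
        return min(max(i, 0), n - 1)  # clamp to [0, n - 1]
    if mode == "wrap":
        return i % n  # wrap around, negative indices included
    raise ValueError(f"unknown mode: {mode!r}")


def take(values, indices, mode="wrap"):
    """Gather values at `indices`, resolving each index with `mode`."""
    n = len(values)
    return [values[resolve_index(i, n, mode)] for i in indices]


data = [10, 20, 30]
clipped = take(data, [0, 4, -1], mode="clip")
wrapped = take(data, [0, 4, -1], mode="wrap")
```

Under these assumed semantics the same index list selects different elements, which is why code relying on the old default may need to pass `mode` explicitly.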

Fixed

Fix to add empty values check for dpctl.tensor.place #1105, #1106
Fixed gh-1089 by improving dpctl.tensor.asarray handling of NumPy arrays viewing into host-accessible USM allocation objects.
MAINT: Fixed build break with newer GCC and SYCLOS #1118
Fixed a bug in basic indexing of dpctl.tensor.usm_ndarray #1136


28 Mar 13:54

oleksandr-pavlyk

0.14.2

6dc5479

v0.14.2

Added

Added dpctl.SyclDevice.partition_max_sub_devices property #1005
Added dpctl.program.SyclKernel.max_sub_group_size property #1028
Implemented printing of usm_ndarray #1013, #1043, #1060
Implemented support for advanced indexing for dpctl.tensor.usm_ndarray #1095, #1097, #1099, #1101
Implemented support for platform listing in dpctl.__main__ script #1014
Improved performance of dpctl.tensor.asnumpy #1026
Added UsmNDArray_Make* C-API for constructing dpctl.tensor.usm_ndarray from native allocations #1050, #1067
Added support for dpctl.SyclDevice.native_vector_width_* device descriptors #1075
Added dpctl::tensor::usm_ndarray::get_shape_vector and dpctl::tensor::usm_ndarray::get_strides_vector methods #1090

Changed

Removed dpctl.select_host_device, dpctl.has_host_device, dpctl.SyclDevice.is_host, and dpctl.SyclDevice.has_aspect_host since support for host device has been removed in DPC++ 2023 and from SYCL 2020 spec #1028
usm_ndarray is made writable by default #1012, and the writable flag is now checked by __setitem__.
Added convenience signature for C++ utility function in "dpctl4pybind11.hpp" #1016
Improved error reported when attempting to submit kernel that uses a data-type unsupported by target device #1018, #1040
Updated C++ code to require DPC++ 2023.0.0 or newer #1028, #1066
The dpctl.tensor.Device class supports print_device_info method #1029, equality comparison, and hashing #1048
Updated version of pybind11 used to 2.10.2 #1031
Improved internal utility responsible for reduction of iteration space dimensionality #1044, #1054
Changed return type of DPCTLUSM_GetPointerType function in SyclInterface library #1061, #1065
Updated supported version of DLPack to 0.8 #1073
Implemented queue cache per context/device pair and deployed it in dpctl.memory, dpctl.tensor.from_dlpack and dpctl.tensor array creation functions #1076, #1079
Maintenance, CI work: #1001, #1009, #1011, #1024, #1030, #1032, #1035, #1037, #1039, #1041, #1045, #1047, #1055, #1057, #1059, #1068, #1070, #1074, #1077, #1078, #1081, #1084, #1085, #1088, #1086, #1092, #1093
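
The idea of a queue cache keyed by a context/device pair can be sketched in plain Python. The class `QueueCache`, the factory function, and the string stand-ins for contexts and devices are all hypothetical; this is a minimal illustration of the caching pattern, not dpctl's actual cache.

```python
class QueueCache:
    """Create one queue per (context, device) pair and reuse it afterwards."""

    def __init__(self, factory):
        self._factory = factory  # callable that builds a queue for a pair
        self._cache = {}

    def get(self, context, device):
        key = (context, device)
        if key not in self._cache:
            self._cache[key] = self._factory(context, device)
        return self._cache[key]


created = []


def make_queue(context, device):
    """Stand-in factory: records each construction and returns a dummy queue."""
    created.append((context, device))
    return object()


cache = QueueCache(make_queue)
q1 = cache.get("ctx0", "gpu:0")
q2 = cache.get("ctx0", "gpu:0")  # same pair: the cached queue is returned
q3 = cache.get("ctx0", "cpu:0")  # different device: a new queue is built
```

Reusing one queue per pair avoids repeated queue construction when many arrays are created on the same device.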

Fixed

Fixed error gh-998 in forming Python exception, #999.
A small memory leak fixed, #1000
Improved dtype support in dpctl.tensor.full, PR #1002
Added missing header file #1008 fixing gh-1007
Fixed a typo in device-specific dtype mapping #1015
Fixed default device integer type to align with NumPy's behavior on Windows #1017
Fixed unexpected overflow in dpctl.tensor.linspace when one of the parameters is the largest floating point value #1034
Constructors dpctl.tensor.empty, dpctl.tensor.zeros, and the usm_ndarray constructor itself no longer allow creating arrays with data types not supported by the targeted device #1042
Fixed parameter validation in dpctl.SyclQueue constructor #1052
Fixed usm_type of the resulting array in dpctl.tensor.tril and dpctl.tensor.triu functions #1062
Used DPC++ configuration files to ensure correct use of conda compiler toolchain on Linux #1072
Fixed issue with empty argument of dpctl.tensor.meshgrid function #1080
Fixed linking problem on Windows enabling dpctl to be functional on Windows for devices not supporting some data types #1083
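
The kind of overflow mentioned for `linspace` can be reproduced in pure Python. The sketch below contrasts a naive step-based formula with an interpolation form that never computes `stop - start`; both function names are hypothetical, and the interpolation is only one possible remedy, not necessarily the one used in the fix.

```python
import math
import sys


def linspace_naive(start, stop, num):
    """Step-based formula: `stop - start` can overflow to infinity."""
    step = (stop - start) / (num - 1)
    return [start + i * step for i in range(num)]


def linspace_interp(start, stop, num):
    """Weighted-average formula: each term stays within [start, stop]."""
    if num < 2:
        raise ValueError("num must be at least 2 in this sketch")
    out = []
    for i in range(num):
        t = i / (num - 1)
        out.append(start * (1.0 - t) + stop * t)
    return out


big = sys.float_info.max
naive = linspace_naive(-big, big, 5)
stable = linspace_interp(-big, big, 5)
all_finite = all(math.isfinite(x) for x in stable)
```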

Full Changelog: 0.14.0...0.14.2


19 Nov 05:10

oleksandr-pavlyk

0.14.0

21a6931

v0.14.0

[0.14.0] - 11/18/2022

Added

Implemented dpctl.tensor.linspace function from array-API #875.
Implemented dpctl.tensor.eye function from array-API #896.
Implemented dpctl.tensor.tril and dpctl.tensor.triu functions from array-API #910.
Added data type objects to dpctl.tensor namespace, finfo, iinfo, can_cast, and result_type functions #913.
Implemented dpctl.tensor.meshgrid creation function from array-API #920.
Implemented convenience class to represent output of dpctl.tensor.usm_ndarray.flags property #921.
Added new device attributes and kernel's device-specific attributes #894.
Added dpctl.utils.onetrace_enabled context manager for targeted trace collection #903.
Added support for stream keyword in __dlpack__ method, enabling support for sending usm_ndarray using mpi4py #906.
dpctl.tensor.asarray can now transition data between incompatible devices, #951.
Introduced "syclinterface/dpctl_sycl_types_casters.hpp" header file with declaration of conversion routines between SYCL type pointers and SyclInterface library opaque pointers #960.
Added C-API to dpctl.program.SyclKernel and dpctl.program.SyclProgram. Added type casters for new types to "dpctl4pybind11" and added an example demonstrating its use #970.
Introduced "dpctl/sycl.pxd" Cython declaration file to streamline use of SYCL functions from Cython, and added an example demonstrating its use #981.
Added experimental support for sharing data allocated on sub-devices via dlpack #984.
Added dpctl.SyclDevice.sub_group_sizes property to retrieve supported sizes of sub-group by the device #985.

Changed

Improved queue compatibility testing in dpctl.tensor's implementation module #900.
Added automatic measurement of array-API conformance test suite in CI #901.
Improved performance of array metadata transfer from host to device #912.
Used os.add_dll_directory on Windows to ensure that DPCTLSyclInterface library can be found #918.
Refactored dpctl.tensor's implementation module #941 to streamline adding new functionality. Streamlined dpctl::tensor::usm_ndarray class implementation.
Added debugging messaging in case when DPCTLDynamicLib::getSymbol encounters errors #956.
Updated code base according to changes in DPC++ compiler #952, #957, #958.
Changed dpctl to use pybind11 2.10.1 #967.
Extended dpctl.tensor.full to accept 0d and higher dimensional arrays for fill-value parameter #982 and #995.

Fixed

Improved SyclDevice constructor error message #893.
Fixed issue gh-890 about dpctl.tensor.reshape function #915.
Fixed unexpected UnboundLocalError exception in #922.
Fixed bugs in dpctl.tensor.arange in #945.
Fixed issue with type inferencing in dpctl.tensor.asarray in #949.
Added missing docstrings for dpctl.SyclDevice properties #964.


28 Jul 21:30

oleksandr-pavlyk

0.13.0

5004aa1

v0.13.0

Added

Implemented and deployed dedicated kernels for copying with casting #781, used in __setitem__, the implementation of asarray, and dpctl.tensor.copy functions.
Implemented dedicated copying kernel for dpctl.tensor.reshape function #810, added support for copy keyword #807.
Implemented dedicated kernel to copy with casting from numpy.ndarray into dpctl.tensor.usm_ndarray #817.
Implemented dpctl.tensor.permute_dims function from array-API #787.
Implemented dpctl.tensor.expand_dims function from array-API #788.
Implemented dpctl.tensor.squeeze function from array-API #790.
Implemented dpctl.tensor.broadcast_to function from array-API #791.
Implemented dpctl.tensor.broadcast_arrays function from array-API #798.
Implemented dpctl.tensor.flip function from array-API #801.
Implemented dpctl.tensor.usm_ndarray.mT property per array-API #805.
Implemented dpctl.tensor.roll function from array-API #809.
Implemented dpctl.tensor.arange function from array-API #814.
Implemented dpctl.tensor.zeros function from array-API #816.
Implemented dpctl.tensor.ones, dpctl.tensor.full, dpctl.tensor.empty_like, dpctl.tensor.zeros_like, dpctl.tensor.ones_like, dpctl.tensor.full_like functions from array-API #822.
Implemented DPCTLQueue_Memset function in SyclInterface library #812, and exposed it for dpctl.memory.MemoryUSM* classes #815.
Implemented dpctl.utils.get_coerced_usm_type to deduce the USM type of the output array from the USM types of input arrays in the compute-follows-data execution model #797.
Added dpctl.SyclDevice.profiling_timer_resolution property #825.
Added dpctl.SyclDevice.platform and dpctl.SyclPlatform.default_context properties #827.
Provided pybind11 example for functions working on dpctl.tensor.usm_ndarray container applying oneMKL functions #780, #793, #819. The example was expanded to demonstrate implementing iterative linear solvers (Chebyshev solver, and Conjugate-Gradient solver) by asynchronously submitting individual SYCL kernels from Python #821, #833, #838.
Wrote manual page about working with dpctl.SyclQueue #829.
Added cmake scripts to dpctl package layout and a way to query the location #853.
Implemented dpctl.tensor.concat function from array-API #867.
Implemented dpctl.tensor.stack function from array-API #872.
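
Deducing the USM type of an output from the USM types of the inputs can be sketched as a simple precedence rule. The sketch assumes the precedence device, then shared, then host; that ordering and the function name `coerce_usm_type` are assumptions made for illustration, not a copy of `dpctl.utils.get_coerced_usm_type`.

```python
# Assumed precedence, highest first.
_PRECEDENCE = ("device", "shared", "host")


def coerce_usm_type(usm_types):
    """Return the highest-precedence USM type present, or None if empty."""
    for t in usm_types:
        if t not in _PRECEDENCE:
            raise ValueError(f"unrecognized USM type: {t!r}")
    for candidate in _PRECEDENCE:
        if candidate in usm_types:
            return candidate
    return None


mixed = coerce_usm_type(["host", "shared", "host"])
with_device = coerce_usm_type(["shared", "device"])
```

Under this assumed rule, mixing a host and a shared allocation yields a shared output, and any device allocation among the inputs yields a device output.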

Changed

Enhanced coverage collection for SyclInterface library by also collecting it during pytest run and combining traces with those collected during C-test run #818. This change also avoids rebuilding the SyclInterface library when building the C-test executable.
Exported keep_args_alive utility in dpctl4pybind11.hpp header #820. The utility uses sycl::handler::host_task to keep given Python arguments alive until each sycl::event from the given vector of events is complete. The host task is scheduled on the SYCL queue provided as the first argument.
Changed the size of struct underlying dpctl.SyclEvent to avoid storing Python object previously used to keep kernel arguments scheduled with dpctl.SyclQueue.submit #823.
Fixed docstring for dpctl.SyclTimer #824.
Changed type of exceptions raised on failure to create dpctl.SyclDevice from ValueError to dpctl.SyclDeviceCreationError #826.
Improved performance of pybind11 type casters #837.
Changed implementation of dpctl.SyclProgram from using deprecated sycl::program to sycl::kernel_bundle #845.
Removed deprecated device aspects, added new supported aspects #844.
Updated vendored dlpack.h to version 0.7 #847.

Fixed

Fixed dpctl.lsplatform() to work correctly when used from within Jupyter notebook #800.
Fixed script to drive debug build #835 and fixed code to compile in debug mode #836.
Fixed filter selector string produced in outputs of dpctl.lsplatform(verbosity=2) and dpctl.SyclDevice.print_device_info #866.
Fixed issue with slicing reported in gh-870 in #871.

New contributor: @npolina4 contributed #867, #872 and reported #870


01 Mar 23:54

xaleryb

0.12.0

9815a7e

v0.12.0

What's changed in 0.12.0

Added

Properties added to MemoryUSM* objects. #647
Added dpctl.tensor.asarray #646
Implemented DLPack support for usm_ndarray #682
Exported dpctl.tensor.Device class #708 #718
Added testing of examples in CI #722
Added user manuals to dpctl documentation #712 #773

Changed

Folder dpctl-capi/ renamed to libsyclinterface/ in sources and documentation. #666 #768
Added workflow to publish rendered documentation on PRs #673 #753 #726
Synchronization functions and USM allocation functions release GIL #736 #766
dpctl.SyclEvent destructor is made non-blocking #751

Fixed

Fixed issue in code of dpctl.tensor.usm_ndarray.T #653
Fixed issue with dpctl.tensor.reshape's effect on contiguity flags of usm_ndarray #695
Fixed handling of empty list by dpctl.tensor.asarray #694
Fixed type inference with array of empty arrays in dpctl.tensor.asarray #697
Fixed issue gh-698 with dpctl.tensor.asarray #709
Fixed performance of item assignment from numpy array #724
DPCTLDeviceMgr_GetNumDevices should not operate on rejected devices #737
Fixed issue gh-729 for dpctl.tensor.reshape applied to 0-element usm_ndarray #756
Fixed issue gh-728 with dpctl.tensor.astype #757
Fixed type in memory overlapping test #770
Fixed issue with operator.pos for dpctl.tensor.usm_ndarray #783
Only call PyThread_Ensure from host_task if the main-thread interpreter is initialized and not finalizing #776 #778 #721

Full Changelog: 0.11.4...0.12.0


03 Dec 09:01

PokhodenkoSA

0.11.4

47a2684

0.11.4

What's Changed

Fix tests for nested context factories expecting for integration environment by @PokhodenkoSA in #705

Full Changelog: 0.11.3...0.11.4


01 Dec 09:10

PokhodenkoSA

0.11.3

4cf2b20

0.11.3

Fixed

Set the last byte in allocated char array to zero [cherry picked from #650] (#699)

Full Changelog: 0.11.2...0.11.3


29 Nov 17:05

PokhodenkoSA

0.11.2

8f21b44

0.11.2

Added

Extending dpctl.device_context with nested contexts (#678)

Fixed

Fixed issue #649 about incorrect behavior of .T method on sliced arrays (#653)

Full Changelog: 0.11.1...0.11.2


13 Nov 01:23

diptorupd

0.11.1

3a5e53b

0.11.1

Changed

Replaced uses of clang compiler with icx executable (#665)

Releases: IntelPython/dpctl
