-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[python-package] Add support for NumPy 2.0, test against nightly versions of dependencies (fixes #6454) #6467
Conversation
@@ -20,6 +20,10 @@ | |||
else: | |||
import pyarrow as pa # type: ignore | |||
|
|||
assert ( | |||
lgb.compat.PYARROW_INSTALLED is True | |||
), "'pyarrow' and its dependencies must be installed to run the arrow tests" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I found myself in a situation where pyarrow
was installed but cffi
was not.
That caused this import to fail:
LightGBM/python-package/lightgbm/compat.py
Line 212 in e0cda88
from pyarrow.cffi import ffi as arrow_cffi |
which led to all of the pyarrow
imports being defined with the placeholders like this:
LightGBM/python-package/lightgbm/compat.py
Lines 227 to 231 in e0cda88
class pa_ChunkedArray: # type: ignore | |
"""Dummy class for pa.ChunkedArray.""" | |
def __init__(self, *args: Any, **kwargs: Any): | |
pass |
which led to every Arrow test failing in a hard-to-understand way.
pytest tests/python_package_test/test_arrow.py
==== short test summary info ====
FAILED tests/python_package_test/test_arrow.py::test_dataset_construct_fuzzy[<lambda>-dataset_params0] - AssertionError: assert False
...
FAILED tests/python_package_test/test_arrow.py::test_predict_ranking - TypeError: Wrong type(ChunkedArray) for label.
==== 139 failed in 11.77s ====
As of this change, it'll fail with a slightly clearer error message (and earlier, without running all the tests).
E AssertionError: 'pyarrow' and its dependencies must be installed to run the arrow tests
@@ -89,6 +116,7 @@ jobs: | |||
run: | | |||
docker run \ | |||
--rm \ | |||
--env CMAKE_BUILD_PARALLEL_LEVEL=${{ env.CMAKE_BUILD_PARALLEL_LEVEL }} \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is one that I missed in #6458, noticed it when replicating some of the configuration for the new CI job here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you!
Thanks very much for the reviews! I'm very excited to now have this CI job testing against nightlies of |
Fixes #6454.
This PR makes
lightgbm
compatible withnumpy
2.xnp.array(..., copy=False)
calls (which are no longer supported in NumPy 2.x) withnp.asarray(...)
.lightgbm
against nightlies ofmatplotlib
,numpy
,pandas
,pyarrow
,scipy
, andscikit-learn
Also fixes some other smalls things I noticed thanks to the new CI job.
test_arrow.py
tests are failing becausepyarrow
is installed butcffi
is notpandas>=3.0
(which is not release yet), to deal with a change topd.DataFrame(...)
's copying behavior when built from anumpy
arrayNotes for Reviewers
What does it mean that
np.array(..., copy=False)
is "no longer supported"?Raises an exception if you try to ask it to do an impossible cast.
So
lightgbm
is going to start creating unnecessary copies again onpandas
input, if you usenumpy
2.0`?No.
pandas is going to start creating copies on NumPy input under some conditions. Follow the conversation at pandas-dev/pandas#58913.
cc @jmoralez, since you did so much work on this e.g. in #4927 and #5612