Improve `read_numeric()` to vastly increase `parse()` performance for all tags #150

Comments
I just noticed that pre-made … But, still, are improvements to this crucial function welcome?
At runtime, …
That's why I created the benchmarks with other NBT implementations... no point doing experiments if I can't accurately measure the gains. And little point trying to improve what is already pretty damn good. My initial assumption that it was slow and could be "vastly improved" turned out to be wrong.

But still, one experiment I might try is to use an (attribute?) assignment once per … The point is that there is little point allowing endianness to be set on a per-tag basis: either the whole file is little endian or big endian, so we can take advantage of this assumption. Hmm, perhaps …

Benchmarks. We need benchmarks. Or skip all of that and go Cython. Please!
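One way to read that experiment, sketched below with hypothetical names (nothing here is an existing nbtlib API): since the whole file shares one byte order, the `Struct` objects could be bound once per file instead of consulting `byteorder` on every tag read.

```python
import struct

# Hypothetical sketch: resolve the byte order once per file and bind the
# Structs up front, so individual tag reads never branch on byteorder again.
class NumericReader:
    def __init__(self, fileobj, byteorder="big"):
        prefix = ">" if byteorder == "big" else "<"
        self.fileobj = fileobj
        self.byte = struct.Struct(prefix + "b")
        self.int = struct.Struct(prefix + "i")
        self.double = struct.Struct(prefix + "d")

    def read(self, fmt):
        # fmt is one of the pre-bound Structs above
        return fmt.unpack(self.fileobj.read(fmt.size))[0]
```

Whether this beats a simple per-call dictionary lookup is exactly the kind of question the benchmarks would have to answer.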
An interesting optimization approach taken by Minecraft: it caches all 256 possible …
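Presumably this refers to pre-building one object per possible byte value. A minimal Python sketch of that style of cache (the `Byte` class and `from_value` helper here are hypothetical, not nbtlib's actual tags): parsing a byte then becomes a list lookup instead of an object construction.

```python
# Hypothetical sketch of a per-value cache: one instance is pre-built for each
# of the 256 possible byte values and reused on every parse.
class Byte(int):
    _cache = []  # populated below with one instance per value in [-128, 127]

    @classmethod
    def from_value(cls, value):
        return cls._cache[value & 0xFF]  # reuse the pre-built instance

Byte._cache = [int.__new__(Byte, v - 256 if v >= 128 else v) for v in range(256)]
```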
When doing some profiling loading NBT files, trying to optimize loading times, `read_numeric()` stands at the top by a large margin. Taking a closer look at it, it seems this is the culprit:

…

And that is universally used in all tag classes using a similar pattern:

…
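Roughly, the pattern being described looks something like the following sketch. It is reconstructed from the issue text, not copied from the nbtlib source; only the name `read_numeric` and the `parse`/`byteorder` signature come from the issue itself.

```python
import struct

# Sketch of the described hot path: a fresh struct.Struct is built on every
# call, which is the expensive step the profiler points at.
def read_numeric(fmt, fileobj, byteorder="big"):
    prefix = ">" if byteorder == "big" else "<"
    st = struct.Struct(prefix + fmt)          # new Struct instance per read
    return st.unpack(fileobj.read(st.size))[0]

# ...and every numeric tag class parses itself through it in the same way:
class Int(int):
    fmt = "i"

    @classmethod
    def parse(cls, fileobj, byteorder="big"):
        return cls(read_numeric(cls.fmt, fileobj, byteorder))
```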
The problem is: `read_numeric` creates a new `Struct` instance on every read. That is a very expensive operation. There should probably be a way to pre-build (or cache) such instances, so that either `read_numeric`, `get_format`, or even `BYTE`/`INT`/... contain/return the same struct instances, while still keeping the ability to select `byteorder` on a per-call basis.

I can submit a PR to fix this, and I'm sure reading (and writing) times will vastly improve. I'll do it in a way that does not change the API of any of the tag classes (i.e., keep the `Compound.parse(cls, fileobj, byteorder="big")` signature for all write/parse methods of all tags), and possibly keep the `read_numeric()` signature too (so no changes to the tag classes at all). Most likely `get_format()` will change its signature and/or internal structure, and the underlying `BYTE`/`INT`/... will most likely change their internal values, but I'll do my best to keep them byteorder-agnostic constants.

Is such an improvement welcome?
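A possible shape for that caching, sketched under the same assumptions (this is an illustration of the proposal, not the actual patch): build one `Struct` per byte order at import time so that `read_numeric` only does a dictionary lookup, while `byteorder` can still be chosen per call.

```python
import struct

def get_format(string):
    # Build both byte-order variants once, keyed by the same byteorder
    # argument the tag classes already pass around.
    return {"big": struct.Struct(">" + string),
            "little": struct.Struct("<" + string)}

# Byteorder-agnostic constants, created once at import time.
BYTE = get_format("b")
SHORT = get_format("h")
INT = get_format("i")
LONG = get_format("q")
FLOAT = get_format("f")
DOUBLE = get_format("d")

def read_numeric(fmt, fileobj, byteorder="big"):
    st = fmt[byteorder]                       # cached Struct, no construction
    return st.unpack(fileobj.read(st.size))[0]
```

Keeping the cache keyed by byte order is what preserves the per-call `byteorder` selection the issue wants to retain.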