Add dictionary intersection #490
base: master
Unlike itemgetter, itertoolz.get always returns an iterable when passed a list.
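To illustrate the difference mentioned above using only the standard library: `operator.itemgetter` returns a bare value for one key and a tuple for several. The `get_many` function below is a hypothetical stand-in for the "always an iterable when given a list" behaviour; it is not toolz's implementation.

```python
from operator import itemgetter

record = {"a": 1, "b": 2}

# With a single key, itemgetter returns the bare value...
single = itemgetter("a")(record)
# ...but with several keys it returns a tuple.
several = itemgetter("a", "b")(record)


def get_many(keys, mapping):
    """Hypothetical helper: always return a tuple, even for a single key."""
    return tuple(mapping[k] for k in keys)


one_key = get_many(["a"], record)
```

The inconsistent return shape of `itemgetter` is what makes it awkward to use when the number of keys is not known in advance.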
This seems like a useful and intuitive function to have. With the addition of https://www.python.org/dev/peps/pep-0584/ in Python 3.9, is this still useful? Perhaps. Advantages:
I'm curious how efficiently we can do this in Cython.
Oops, this is intersect.
Regarding how efficiently this can be done in Cython, it comes down to how efficiently we can 1) compute the intersection of the keys and 2) pull each of those common keys from each dictionary. I don't think it can be more efficient than O(nm), where n is the number of dictionaries and m is the number of common keys. This assumes that inserting into and retrieving from the mapping is O(1).
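The two steps above can be sketched in plain Python. This is only an illustration under stated assumptions, not the code in this PR: the name `intersect_sketch` and the choice to collect the values into a tuple per key are both hypothetical.

```python
def intersect_sketch(*dicts):
    """Hypothetical helper: keep only the keys present in every mapping.

    Each common key maps to a tuple of its values, one per input mapping.
    """
    if not dicts:
        return {}
    # Step 1: intersect the keys. Iterating a mapping yields its keys,
    # so this also works for generic mappings.
    common = set(dicts[0]).intersection(*dicts[1:])
    # Step 2: one lookup per (dictionary, common key) pair, i.e. n * m
    # lookups for n dictionaries and m common keys.
    return {k: tuple(d[k] for d in dicts) for k in common}
```

For example, `intersect_sketch({'a': 1, 'b': 2}, {'b': 3, 'c': 4})` keeps only `'b'`, paired with the values `2` and `3`.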
dicts = dicts[0]
factory = _get_factory(merge, kwargs)

dict_keys = map(operator.methodcaller('keys'), sorted(dicts, key=len))
I think I need to test without the sort. Sorting might be slowing things down here.
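One way to check whether the sort matters is a small `timeit` comparison. The sketch below is a hypothetical benchmark, not the PR's code: the function names, the input sizes, and the use of `set.intersection` are all assumptions chosen for illustration.

```python
import timeit


def common_keys_sorted(dicts):
    # Variant with the sort: start from the smallest mapping.
    ordered = sorted(dicts, key=len)
    return set(ordered[0]).intersection(*ordered[1:])


def common_keys_unsorted(dicts):
    # Variant without the sort: use the mappings in the given order.
    return set(dicts[0]).intersection(*dicts[1:])


# Three mappings of very different sizes, largest first.
dicts = [dict.fromkeys(range(n)) for n in (10_000, 100, 1_000)]

# Both variants must agree on the result before timing them.
assert common_keys_sorted(dicts) == common_keys_unsorted(dicts)

for fn in (common_keys_sorted, common_keys_unsorted):
    elapsed = timeit.timeit(lambda: fn(dicts), number=200)
    print(f"{fn.__name__}: {elapsed:.4f}s")
```

Which variant wins likely depends on the number of mappings and how unevenly sized they are, so it is worth timing on inputs that resemble the real workload.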
I wonder if this functionality would be better added to
would that look something like
An efficient way to intersect dictionaries based on their keys. The motivating case was to replace the below line with something nicer and more general.
This function is generalized to compute the intersection of more than two dictionaries and handle generic mappings.
@eriknw, I think other operations on dictionary views would be useful. However, those other operations don't seem to fit the calling conventions for the functions here because they are only well-defined for two dictionaries (operations like difference, symmetric difference, etc). Union would be similar to merge, but preserves all the values from each input dictionary.
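For reference, the two-dictionary operations mentioned above are already available on key views in plain Python. The sketch below is illustrative only; in particular, the shape used for a value-preserving union (a list of values per key) is an assumption, not something proposed in this PR.

```python
a = {"x": 1, "y": 2}
b = {"y": 20, "z": 30}

# Key views support set operators directly.
only_in_a = a.keys() - b.keys()        # difference
in_exactly_one = a.keys() ^ b.keys()   # symmetric difference
in_either = a.keys() | b.keys()        # union of keys

# A union that preserves every value, unlike merge, which keeps one
# value per key. Assumed shape: a list of values per key.
union_values = {k: [d[k] for d in (a, b) if k in d] for k in in_either}
```

Here `union_values["y"]` holds both `2` and `20`, whereas a merge would keep only one of them.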