-
Hi Frank! Thank you for reporting this. You are right that there are several areas where the documentation is sparse and confusing. I apologize. Regarding the sort, it would be wonderful if you opened a feature request. On the docs suggestions, if you'd like to open a PR and contribute them yourself, we would be honored to have your contribution. The documentation lives in paradedb/paradedb under the docs/ folder. If you prefer not to, I'll take care to add your feedback to the documentation. Please let us know how else we can help, and thank you for your kind words :)
-
I found ParadeDB with "pg_search" and "pg_lakehouse" by chance and I am thrilled. You really fill a big gap in the PostgreSQL universe. It makes a database for everything more realistic and feasible. Thanks a lot for that.
I have been working with "pg_search" for the last 24 hours and have integrated it into our website database. The performance is really very good, much faster than the standard PostgreSQL full-text search. I got stuck in a few places at first because of the documentation; in my opinion, it is missing a few examples.
RANGE
For example, the documentation on range is very poor (and also has an error "range => '[1,3)'::int4range"). But after some trial and error, I got it to work.
Here are some suggestions for additions to the documentation (our index: search_content):
tsrange:
daterange:
Combinations:
SORT
The default sorting for "pg_search" is always BM25 scoring. In general, relevance is of course the first and most important sort. However, our users have expressed a desire for other sort orders, e.g. by date. That is, show all content matching the search terms, but ordered by date rather than by relevance.
This does not work optimally at the moment because there is no additional sorting field. Even if a TIMESTAMP field (e.g. "stamp") is available in the index, I cannot use it for sorting. This means that I have to search the entire index first and then re-sort with ORDER BY. The performance is not optimal when there are many sessions.
Maybe I missed something; if so, I would be grateful for a hint. Alternatively, I can open a new feature request.
INDEX: numeric_fields arrays
INT[] and BIGINT[] arrays do not yet work for numeric_fields. It is not possible to filter INT or BIGINT arrays. New feature request?
TOKENIZERS
It is not possible to create your own mixed tokenizers or to apply multiple tokenizers to a field. For example: a tokenizer that supports both "ngrams" and "en_stem". This may lead to better search results. There is already an issue for this: #575. Any hints for an alternative?
STEMMER
Support for multiple languages. There is already an issue for this: #1062
FOUND COUNTER
A search usually comes with a counter showing the total number of records found, even when LIMIT or OFFSET is used. I have not found such a counter in pg_search yet. Do you have any tips on how to get it without a second query?
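For reference, one generic SQL technique for getting a total alongside a paginated result is a `COUNT(*) OVER ()` window function. This is plain SQL, not a pg_search feature, and whether it performs well on top of a BM25 search is an open question. The sketch below is only an illustration: it uses an in-memory SQLite database (3.25 or newer, for window functions) as a stand-in for PostgreSQL, a `LIKE` filter as a stand-in for the full-text search, and a hypothetical `content` table with `title` and `stamp` columns.

```python
import sqlite3

# In-memory SQLite stands in for PostgreSQL; the table and column names
# (content, title, stamp) are hypothetical and chosen for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE content (id INTEGER PRIMARY KEY, title TEXT, stamp TEXT)"
)
conn.executemany(
    "INSERT INTO content (title, stamp) VALUES (?, ?)",
    [(f"postgres search tips {i}", f"2024-05-{i:02d}") for i in range(1, 8)]
    + [("unrelated entry", "2024-05-20")],
)

# The window function is evaluated before LIMIT/OFFSET is applied, so every
# returned row carries the total number of matching rows, not the page size.
page = conn.execute(
    """
    SELECT id, title, stamp, COUNT(*) OVER () AS total_found
    FROM content
    WHERE title LIKE ?
    ORDER BY stamp DESC
    LIMIT 3 OFFSET 0
    """,
    ("%search%",),
).fetchall()

# An empty page (no matches, or an OFFSET past the end) carries no total.
total_found = page[0][3] if page else 0
print(f"showing {len(page)} of {total_found} matches")
```

Two caveats: the database still has to visit every matching row to compute the total, so this saves a round trip rather than the underlying work, and an OFFSET past the last match returns no rows and therefore no total.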
Finally, the only remaining question is: How big is my index? There is already an issue for that: #1061
Thanks and keep up the good work. We will definitely test the search and integrate it in the future.
Regards
Frank from Germany