Releases: pathwaycom/pathway
v0.10.1
Added
- `query` method to `VectorStoreServer` to enable compatible API with `DataIndex`.
- `AdaptiveRAGQuestionAnswerer` to `xpacks.question_answering`. End-to-end pipeline and accompanying code for `Private RAG` showcase.
v0.10.0
Added
- Pathway now warns when unintentionally creating Table with empty universe.
- `pw.io.kafka.write` in `raw` and `plaintext` formats now supports output for tables with multiple columns. For such tables, you must specify the column used as the value of the produced Kafka messages, and you can optionally specify a column used as the key.
- `pw.io.kafka.write` can now output values from the table using Kafka message headers in `raw` and `plaintext` output formats.
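As a rough illustration of the multi-column output described above, here is a plain-Python sketch of how one row could map to a message. It does not call Pathway or Kafka; the `to_message` helper, its parameter names and the string-to-bytes encoding are assumptions made for the example.

```python
from typing import Any, Dict, Optional, Tuple


def to_message(
    row: Dict[str, Any],
    value: str,
    key: Optional[str] = None,
    headers: Tuple[str, ...] = (),
) -> Dict[str, Any]:
    """Build a message: one column as value, an optional key column, other columns as headers."""
    return {
        "key": None if key is None else str(row[key]).encode("utf-8"),
        "value": str(row[value]).encode("utf-8"),
        "headers": [(name, str(row[name]).encode("utf-8")) for name in headers],
    }


# Hypothetical row with three columns.
row = {"id": 7, "body": "hello", "source": "sensor-1"}
msg = to_message(row, value="body", key="id", headers=("source",))
print(msg)
```

The value column is mandatory in this sketch, while the key and header columns are optional, mirroring the entry above.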
Changed
- `instance` arguments to `groupby`, `join`, `with_id_from` now determine how entries are distributed between machines.
- `flatten` results remain on the same machine as their source entries.
- `join` sends each record between machines at most once.
- BREAKING: `flatten`, `join`, `groupby` (if used with `instance`), `with_id_from` (if used with `instance`) generate IDs of the produced rows differently than in the previous versions.
- `pathway spawn` with multiple workers prints only output from the first worker.
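The idea of distributing entries by `instance` can be sketched in plain Python. This is not Pathway's actual sharding scheme, which the changelog does not describe; the CRC32-modulo mapping and the helper names below are assumptions chosen only to show that rows sharing an instance land on the same worker.

```python
import zlib
from collections import defaultdict
from typing import Dict, List, Tuple


def worker_for(instance: str, n_workers: int) -> int:
    """Map an instance value to a worker index with a stable hash (assumed scheme)."""
    return zlib.crc32(instance.encode("utf-8")) % n_workers


def distribute(rows: List[Tuple[str, int]], n_workers: int) -> Dict[int, List[Tuple[str, int]]]:
    """Group (instance, value) rows by the worker their instance maps to."""
    shards = defaultdict(list)
    for instance, value in rows:
        shards[worker_for(instance, n_workers)].append((instance, value))
    return dict(shards)


rows = [("eu", 1), ("us", 2), ("eu", 3), ("us", 4), ("asia", 5)]
shards = distribute(rows, n_workers=3)
print(shards)
```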
v0.9.0
Added
- `pw.reducers.latest` and `pw.reducers.earliest` that return the value with respectively maximal and minimal processing time assigned.
- `pw.io.kafka.write` can now produce messages containing raw bytes in case the table consists of a single binary column and `raw` mode is specified. Similarly, this method will provide plaintext messages if `plaintext` mode is chosen and the table consists of a single string-typed column.
- `pw.io.pubsub.write` connector for publishing Pathway tables into Google PubSub.
- Argument `strict_prompt` to `answer_with_geometric_rag_strategy` and `answer_with_geometric_rag_strategy_from_index` that allows optimizing prompts for smaller open-source LLM models.
- Temporarily switch LiteLLMChat's generation method to sync version due to a bug while using `json` mode with Ollama.
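What the `latest` and `earliest` reducers compute can be shown with a small plain-Python sketch. It is not Pathway code: the `(processing_time, value)` tuple representation and the two function bodies are assumptions made for illustration.

```python
from typing import Any, Iterable, Tuple

Entry = Tuple[int, Any]  # (processing_time, value); hypothetical representation


def latest(entries: Iterable[Entry]) -> Any:
    """Return the value whose processing time is maximal."""
    return max(entries, key=lambda e: e[0])[1]


def earliest(entries: Iterable[Entry]) -> Any:
    """Return the value whose processing time is minimal."""
    return min(entries, key=lambda e: e[0])[1]


entries = [(2, "b"), (6, "c"), (0, "a")]
print(latest(entries))
print(earliest(entries))
```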
Changed
- BREAKING: `pw.io.kafka.read` no longer parses messages from UTF-8 when `raw` mode is specified. To keep the previous UTF-8 parsing behavior, use the `plaintext` mode.
- BREAKING: `Table.flatten` now flattens one column and spreads every other column of the table, instead of taking other columns from the argument list.
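The new `flatten` semantics can be sketched on plain Python dictionaries. This is only an illustration of "flatten one column, spread the others"; the `flatten_rows` helper is hypothetical and does not model Pathway tables or row IDs.

```python
from typing import Any, Dict, List


def flatten_rows(rows: List[Dict[str, Any]], column: str) -> List[Dict[str, Any]]:
    """Emit one output row per element of `column`, copying all other columns."""
    out = []
    for row in rows:
        for item in row[column]:
            new_row = dict(row)      # spread every other column unchanged
            new_row[column] = item   # replace the list with a single element
            out.append(new_row)
    return out


rows = [{"user": "alice", "tags": ["x", "y"]}, {"user": "bob", "tags": ["z"]}]
flat = flatten_rows(rows, "tags")
print(flat)
```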
v0.8.6
Added
- `pw.io.bigquery.write` connector for writing Pathway tables into Google BigQuery.
- Parameter `filepath_globpattern` to `query` method in `VectorStoreClient` for specifying which files should be considered in the query.
- Improved compatibility of `pw.Json` with standard methods such as `len()`, `int()`, `float()`, `bool()`, `iter()`, `reversed()` when feasible.
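The kind of compatibility listed above rests on Python's special methods. The sketch below uses a hypothetical `JsonValue` wrapper, not `pw.Json` itself, to show how a JSON container can opt in to `len()`, `int()`, `float()`, `bool()`, `iter()` and `reversed()`.

```python
from typing import Any, Iterator


class JsonValue:
    """Hypothetical wrapper around a parsed JSON value (not pw.Json itself)."""

    def __init__(self, value: Any) -> None:
        self.value = value

    def __len__(self) -> int:
        return len(self.value)      # arrays, objects, strings

    def __int__(self) -> int:
        return int(self.value)

    def __float__(self) -> float:
        return float(self.value)

    def __bool__(self) -> bool:
        return bool(self.value)

    def __iter__(self) -> Iterator[Any]:
        return iter(self.value)

    def __reversed__(self) -> Iterator[Any]:
        return reversed(self.value)


arr = JsonValue([1, 2, 3])
num = JsonValue(7)
print(len(arr), list(reversed(arr)), float(num))
```

Calls that make no sense for the wrapped value, such as `len()` on a number, still raise `TypeError` in this sketch, which is one reading of "when feasible".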
Changed
- `pw.io.postgres.write` can now parallelize writes to several threads if several workers are configured.
- Pathway now checks types of pointers rigorously. Indexing a table with a number or types of columns that do not match those used to create the index now results in a `TypeError`.
- `pw.Json.as_float()` method now supports integer JSON values.
v0.8.5
Added
- New function `answer_with_geometric_rag_strategy_from_index`, which allows using `answer_with_geometric_rag_strategy` without first retrieving documents from the index.
- Added support for custom state serialization to `udf_reducer`.
- Introduced `instance` parameter in `AsyncTransformer`. All calls with a given `(instance, processing_time)` pair are returned at the same processing time. Ordering is preserved within a single instance.
- Added `successful`, `failed`, `finished` properties to `AsyncTransformer`. They return tables with successful calls, failed calls and all finished calls, respectively.
Changed
- Property `result` of `AsyncTransformer` is deprecated. Property `successful` should be used instead.
- `pw.io.csv.read`, `pw.io.jsonlines.read`, `pw.io.fs.read`, `pw.io.plaintext.read` now handle `path` as a glob pattern and read all matched files and directories recursively.
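A rough approximation of "glob pattern, matched directories read recursively" can be written with the standard library alone. The `expand_path` helper and the file names are hypothetical, and the exact matching rules Pathway applies may differ from this sketch.

```python
import glob
import os
import tempfile
from typing import List


def expand_path(pattern: str) -> List[str]:
    """Return every file matched by `pattern`; matched directories are walked recursively."""
    files = set()
    for match in glob.glob(pattern):
        if os.path.isdir(match):
            for root, _dirs, names in os.walk(match):
                files.update(os.path.join(root, n) for n in names)
        else:
            files.add(match)
    return sorted(files)


# Build a small throwaway tree to try it on.
base = tempfile.mkdtemp()
os.makedirs(os.path.join(base, "data_a", "nested"))
for rel in ["data_a/1.csv", "data_a/nested/2.csv", "data_b.csv", "other.txt"]:
    with open(os.path.join(base, rel), "w") as fh:
        fh.write("x\n")

found = [os.path.relpath(p, base) for p in expand_path(os.path.join(base, "data_*"))]
print(found)
```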
v0.8.4
Fixed
- Pathway now requires the `LiteLLM` package only if you use one of the wrappers for `LiteLLM`.
- Retries are implemented in `pw.io.airbyte.read`.
- State processing protocol is updated in `pw.io.airbyte.read`.
v0.8.3
Added
- New parameters of `pw.UDF` class and `pw.udf` decorator: `return_type`, `deterministic`, `propagate_none`, `executor`, `cache_strategy`.
- The LLM Xpack now provides integrations with LlamaIndex and LangChain for running the Pathway VectorStore server.
Changed
- Subclassing `UDFSync` and `UDFAsync` is deprecated. `UDF` should be subclassed to create a new UDF.
- Passing keyword arguments to `pw.apply`, `pw.apply_with_type`, `pw.apply_async` is deprecated. In the future, they'll be used for configuration, not passing data to the function.
Fixed
- Fixed a minor bug in the `Table.groupby()` method that sometimes prevented access to certain columns in the following `reduce()`.
- Fixed warnings from using OpenAI Async embedding model in the VectorStore in Colab.
v0.8.2
Added
- `%:z` timezone format code to `strptime`.
- Support for Airbyte connectors `pw.io.airbyte`.
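For orientation, assuming `%:z` denotes a UTC offset written with a colon (such as `+01:00`), the sketch below parses a timestamp of that shape. It uses Python's standard `datetime` module, whose `%z` directive accepts the colon form since Python 3.7, and is not a call into Pathway's own `strptime`.

```python
from datetime import datetime, timedelta

# A timestamp whose offset is written with a colon (assumed shape of `%:z`).
stamp = "2024-03-01T12:30:00+01:00"

# Python's own `%z` directive accepts the colon-separated form (Python 3.7+).
parsed = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%S%z")
print(parsed.utcoffset())
```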
v0.8.1
Added
- Introduced the `send_alerts` function in the `pw.io.slack` namespace, enabling users to send messages from a specified column directly to a Slack channel.
- Enhanced the `pw.io.http.rest_connector` with an additional argument called `request_validator`. It lets users validate payloads and raise an `HTTP 400` error if necessary.
Fixed
- Addressed an issue in `pw.io.xpacks.llm.VectorStoreServer` where the computation of the last modification timestamp for an indexed document was incorrect.
Changed
- Improved the behavior of `pw.io.kafka.write`. It now retries when sending data to the output topic fails.
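The changelog does not say how many retries are made or how they are spaced. As a generic sketch of the retry pattern, with an assumed attempt limit and exponential backoff, and a hypothetical `flaky_send` standing in for a producer call:

```python
import time
from typing import Callable


def send_with_retries(send: Callable[[], None], attempts: int = 3, delay: float = 0.0) -> int:
    """Call `send` until it succeeds; return the number of attempts used."""
    for attempt in range(1, attempts + 1):
        try:
            send()
            return attempt
        except ConnectionError:
            if attempt == attempts:
                raise                                # out of attempts: surface the error
            time.sleep(delay * 2 ** (attempt - 1))   # exponential backoff (assumed policy)
    raise ValueError("attempts must be >= 1")


calls = {"n": 0}


def flaky_send() -> None:
    """Hypothetical sender that fails twice, then succeeds."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("broker unavailable")


used = send_with_retries(flaky_send, attempts=5)
print(used)
```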
v0.8.0
Added
- `pw.io.http.rest_connector` now supports multiple HTTP request types.
- `pw.io.http.PathwayWebserver` now allows Cross-Origin Resource Sharing (CORS) to be enabled on newly added endpoints.
- Wrappers for LiteLLM and HuggingFace chat services and SentenceTransformers embedding service are now added to Pathway xpack for LLMs.
Changed
- `pw.run` now includes an additional parameter `runtime_typechecking` that enables strict type checking at runtime.
- Embedders in `pathway.xpacks.llm.embedders` now correctly process empty strings as queries.
- BREAKING: `pw.run` and `pw.run_all` now only accept keyword arguments.
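What "only accept keyword arguments" means for callers can be shown with a stand-in function. The `run` function below is hypothetical and is not `pw.run`; only the bare `*` in its signature matters here.

```python
def run(*, debug: bool = False, runtime_typechecking: bool = False) -> dict:
    """Hypothetical stand-in for a keyword-only entry point (not pw.run itself)."""
    return {"debug": debug, "runtime_typechecking": runtime_typechecking}


config = run(runtime_typechecking=True)   # keyword call works

try:
    run(True)                             # positional call is rejected
    positional_ok = True
except TypeError:
    positional_ok = False

print(config, positional_ok)
```

Code that passed these arguments positionally needs to switch to the keyword form.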
Fixed
- `pw.Duration` can now be returned from User-Defined Functions (UDFs) or used as a constant value without resulting in errors.
- `pw.io.debezium.read` now correctly handles tables that do not have a primary key.