Releases: pathwaycom/pathway

v0.10.1

30 Apr 12:25

Added

  • query method to VectorStoreServer, making its API compatible with DataIndex.
  • AdaptiveRAGQuestionAnswerer to xpacks.question_answering. End-to-end pipeline and accompanying code for Private RAG showcase.

v0.10.0

24 Apr 22:21

Added

  • Pathway now warns when a Table with an empty universe is created unintentionally.
  • pw.io.kafka.write in raw and plaintext formats now supports tables with multiple columns. For such tables, you must specify the column to use as the value of the produced Kafka messages, and you can optionally specify a column to use as the key.
  • pw.io.kafka.write can now send values from the table as Kafka message headers in the 'raw' and 'plaintext' output formats.
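As a rough illustration of the multi-column behavior above, the sketch below shows how one table row could map onto a Kafka message once a value column, an optional key column, and header columns are chosen. It is plain Python, not Pathway's implementation: `row_to_message` and its parameters are hypothetical names, and encoding header values as UTF-8 bytes is an assumption.

```python
from typing import Any, Optional


def row_to_message(
    row: dict[str, Any],
    value_column: str,
    key_column: Optional[str] = None,
    header_columns: tuple[str, ...] = (),
) -> dict[str, Any]:
    """Hypothetical helper: pick the payload, key and headers from a row."""

    def to_bytes(item: Any) -> bytes:
        # Bytes are passed through unchanged; anything else is UTF-8 encoded.
        return item if isinstance(item, bytes) else str(item).encode("utf-8")

    return {
        "key": to_bytes(row[key_column]) if key_column is not None else None,
        "value": to_bytes(row[value_column]),
        # Assumption: each header is a (column name, encoded value) pair.
        "headers": [(name, to_bytes(row[name])) for name in header_columns],
    }


row = {"user": "alice", "body": "hello", "source": "web"}
message = row_to_message(
    row, value_column="body", key_column="user", header_columns=("source",)
)
```

Without `key_column`, the sketch leaves the key unset, which mirrors the note that the key column is optional while the value column is required.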

Changed

  • instance arguments to groupby, join, with_id_from now determine how entries are distributed between machines.
  • flatten results remain on the same machine as their source entries.
  • join sends each record between machines at most once.
  • BREAKING: flatten, join, groupby (if used with instance), with_id_from (if used with instance) generate IDs of the produced rows differently than in the previous versions.
  • pathway spawn with multiple workers prints only output from the first worker.
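To make the role of the instance argument more concrete, here is a minimal sketch of routing rows to workers by hashing their instance value, so that all rows sharing an instance end up on the same worker. This is only an illustration of the general idea: `worker_for_instance` is a hypothetical name, and the hashing scheme is an assumption, not the one Pathway uses.

```python
import hashlib


def worker_for_instance(instance: str, n_workers: int) -> int:
    """Hypothetical routing rule: hash the instance value, not the row ID."""
    digest = hashlib.sha256(instance.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % n_workers


rows = [("eu", 1), ("us", 2), ("eu", 3), ("us", 4)]
placement: dict[int, list[tuple[str, int]]] = {}
for instance, value in rows:
    worker = worker_for_instance(instance, 4)
    placement.setdefault(worker, []).append((instance, value))
```

Because the worker depends only on the instance, a change in how rows are routed also changes where derived rows live, which is one plausible reason the generated row IDs differ from earlier versions.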

v0.9.0

18 Apr 21:01

Added

  • pw.reducers.latest and pw.reducers.earliest, which return the value with the maximal and minimal assigned processing time, respectively.
  • pw.io.kafka.write can now produce messages containing raw bytes when the table consists of a single binary column and raw mode is specified. Similarly, it produces plaintext messages when plaintext mode is chosen and the table consists of a single string-typed column.
  • pw.io.pubsub.write connector for publishing Pathway tables into Google PubSub.
  • Argument strict_prompt to answer_with_geometric_rag_strategy and answer_with_geometric_rag_strategy_from_index that allows optimizing prompts for smaller open-source LLMs.
  • LiteLLMChat's generation method is temporarily switched to the sync version due to a bug when using JSON mode with Ollama.
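The semantics described for pw.reducers.latest and pw.reducers.earliest can be sketched in plain Python as follows. The data and the explicit processing-time field are made up for illustration; in Pathway the processing time is assigned by the engine rather than stored in a column.

```python
from collections import defaultdict

# (group key, value, processing time) triples; the times are made up.
events = [
    ("sensor_a", 10, 1),
    ("sensor_a", 12, 3),
    ("sensor_b", 7, 2),
    ("sensor_a", 11, 2),
]

by_key = defaultdict(list)
for key, value, processing_time in events:
    by_key[key].append((processing_time, value))

# latest: value with the maximal processing time; earliest: the minimal one.
latest = {
    key: max(items, key=lambda item: item[0])[1] for key, items in by_key.items()
}
earliest = {
    key: min(items, key=lambda item: item[0])[1] for key, items in by_key.items()
}
```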

Changed

  • BREAKING: pw.io.kafka.read no longer decodes messages as UTF-8 when raw mode is specified. To keep the previous behavior, use plaintext mode.
  • BREAKING: Table.flatten now flattens one column and spreads every other column of the table, instead of taking other columns from the argument list.
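The new Table.flatten behavior can be pictured with a small plain-Python sketch: one output row per element of the flattened column, with every other column copied alongside. `flatten_column` is a hypothetical helper operating on lists of dicts, not Pathway code.

```python
def flatten_column(rows: list[dict], column: str) -> list[dict]:
    """Emit one output row per element of `column`, copying all other columns."""
    flattened = []
    for row in rows:
        for element in row[column]:
            flattened.append({**row, column: element})
    return flattened


table = [
    {"owner": "alice", "pets": ["cat", "dog"], "city": "Paris"},
    {"owner": "bob", "pets": ["fish"], "city": "Lyon"},
]
result = flatten_column(table, "pets")
```

Note that `owner` and `city` are carried over without being listed, which is the difference from the earlier behavior of taking the other columns from the argument list.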

v0.8.6

10 Apr 20:16

Added

  • pw.io.bigquery.write connector for writing Pathway tables into Google BigQuery.
  • filepath_globpattern parameter to the query method of VectorStoreClient, for specifying which files should be considered in the query.
  • Improved compatibility of pw.Json with standard methods such as len(), int(), float(), bool(), iter(), reversed() when feasible.
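For readers unfamiliar with how a wrapper type becomes compatible with len(), int(), float(), bool(), iter() and reversed(), the sketch below shows the Python special methods involved. `JsonValue` is a hypothetical stand-in written for this illustration; it is not pw.Json and makes no claim about how pw.Json is implemented.

```python
from typing import Any


class JsonValue:
    """Hypothetical stand-in for a JSON wrapper; not Pathway's pw.Json."""

    def __init__(self, value: Any) -> None:
        self.value = value

    def __len__(self) -> int:
        return len(self.value)  # arrays, objects, strings

    def __int__(self) -> int:
        return int(self.value)

    def __float__(self) -> float:
        return float(self.value)  # works for integer values too

    def __bool__(self) -> bool:
        return bool(self.value)

    def __iter__(self):
        return iter(self.value)

    def __reversed__(self):
        return reversed(self.value)


numbers = JsonValue([1, 2, 3])
count = JsonValue(42)
```

Calls that make no sense for the wrapped value, such as len() on a number, still raise a TypeError here, which matches the "when feasible" qualifier above.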

Changed

  • pw.io.postgres.write can now parallelize writes to several threads if several workers are configured.
  • Pathway now checks types of pointers rigorously. Indexing a table with a number or types of columns that differ from those used to create the index now results in a TypeError.
  • pw.Json.as_float() method now supports integer JSON values.

v0.8.5

27 Mar 22:03

Added

  • New function answer_with_geometric_rag_strategy_from_index, which allows using answer_with_geometric_rag_strategy without first retrieving documents from the index.
  • Added support for custom state serialization to udf_reducer.
  • Introduced instance parameter in AsyncTransformer. All calls with a given (instance, processing_time) pair are returned at the same processing time. Ordering is preserved within a single instance.
  • Added successful, failed, finished properties to AsyncTransformer. They return tables with successful calls, failed calls and all finished calls, respectively.
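The relationship between the successful, failed and finished views can be sketched with plain asyncio: every completed call is "finished", and it is either "successful" or "failed". This is a conceptual illustration only; `invoke` and `run_all` are hypothetical names and the code does not use AsyncTransformer.

```python
import asyncio


async def invoke(value: int) -> int:
    """Toy asynchronous call that fails for negative inputs."""
    await asyncio.sleep(0)
    if value < 0:
        raise ValueError(f"negative input: {value}")
    return value * 2


async def run_all(values: list[int]) -> dict[str, list]:
    # return_exceptions=True keeps failures in the result list instead of raising.
    outcomes = await asyncio.gather(
        *(invoke(v) for v in values), return_exceptions=True
    )
    finished = list(zip(values, outcomes))
    return {
        "successful": [(v, out) for v, out in finished if not isinstance(out, Exception)],
        "failed": [v for v, out in finished if isinstance(out, Exception)],
        "finished": [v for v, _ in finished],
    }


tables = asyncio.run(run_all([1, -2, 3]))
```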

Changed

  • Property result of AsyncTransformer is deprecated. Property successful should be used instead.
  • pw.io.csv.read, pw.io.jsonlines.read, pw.io.fs.read, pw.io.plaintext.read now handle path as a glob pattern and read all matched files and directories recursively.
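A minimal sketch of what "path as a glob pattern, read recursively" can mean in practice: expand the pattern, then walk into any matched directory. `expand_path` is a hypothetical helper built on the standard library; the exact matching rules of the Pathway connectors may differ.

```python
import glob
import os
import tempfile


def expand_path(pattern: str) -> list[str]:
    """Return every file matched by `pattern`, descending into matched directories."""
    files = []
    for match in glob.glob(pattern):
        if os.path.isdir(match):
            for root, _dirs, names in os.walk(match):
                files.extend(os.path.join(root, name) for name in names)
        else:
            files.append(match)
    return sorted(files)


with tempfile.TemporaryDirectory() as base:
    os.makedirs(os.path.join(base, "logs_2024", "nested"))
    for parts in (("logs_2024", "a.csv"), ("logs_2024", "nested", "b.csv"), ("other.csv",)):
        with open(os.path.join(base, *parts), "w") as handle:
            handle.write("x\n1\n")
    pattern = os.path.join(base, "logs_*")
    matched = [os.path.relpath(path, base) for path in expand_path(pattern)]
```

Here the pattern matches only the `logs_2024` directory, so both files under it are picked up while `other.csv` is skipped.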

v0.8.4

18 Mar 17:52

Fixed

  • Pathway now requires the LiteLLM package only if you use one of the LiteLLM wrappers.
  • Retries are implemented in pw.io.airbyte.read.
  • State processing protocol is updated in pw.io.airbyte.read.

v0.8.3

13 Mar 21:17

Added

  • New parameters of pw.UDF class and pw.udf decorator: return_type, deterministic, propagate_none, executor, cache_strategy.
  • The LLM Xpack now provides integrations with LlamaIndex and LangChain for running the Pathway VectorStore server.
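Of the new UDF parameters, propagate_none is perhaps the easiest to picture. The sketch below assumes it means "skip the call and return None if any argument is None"; that reading is an assumption, since the entry above does not spell it out. The decorator is a plain-Python illustration with a hypothetical implementation, not the pw.udf decorator.

```python
import functools
from typing import Any, Callable


def propagate_none(func: Callable[..., Any]) -> Callable[..., Any]:
    """Skip the call and return None when any argument is None."""

    @functools.wraps(func)
    def wrapper(*args: Any, **kwargs: Any) -> Any:
        if any(arg is None for arg in (*args, *kwargs.values())):
            return None
        return func(*args, **kwargs)

    return wrapper


@propagate_none
def add(a: int, b: int) -> int:
    return a + b
```

With this wrapper, `add` never has to handle missing inputs itself, which keeps the function body free of None checks.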

Changed

  • Subclassing UDFSync and UDFAsync is deprecated. UDF should be subclassed to create a new UDF.
  • Passing keyword arguments to pw.apply, pw.apply_with_type, pw.apply_async is deprecated. In the future, they'll be used for configuration, not passing data to the function.

Fixed

  • Fixed a minor bug in the Table.groupby() method that sometimes prevented access to certain columns in the subsequent reduce().
  • Fixed warnings from using OpenAI Async embedding model in the VectorStore in Colab.

v0.8.2

28 Feb 12:56

Added

  • %:z timezone format code to strptime.
  • Support for Airbyte connectors via pw.io.airbyte.
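For context on the %:z code, the sketch below parses a timestamp whose UTC offset is written with a colon (+01:00). It assumes %:z denotes that colon-separated form, as it does in strftime dialects that define the code. The example uses Python's own datetime.strptime, where the plain %z directive already accepts this form (Python 3.7+); it does not call Pathway.

```python
from datetime import datetime, timedelta

# Assumption: a %:z-style code denotes an offset such as +01:00.
# Python's %z accepts that colon-separated form directly.
parsed = datetime.strptime("2024-02-28T12:56:00+01:00", "%Y-%m-%dT%H:%M:%S%z")
offset = parsed.utcoffset()
```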

v0.8.1

15 Feb 13:42

Added

  • Introduced the send_alerts function in the pw.io.slack namespace, enabling users to send messages from a specified column directly to a Slack channel.
  • Enhanced the pw.io.http.rest_connector by introducing an additional argument called request_validator. This feature empowers users to validate payloads and raise an HTTP 400 error if necessary.
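A request validator of the kind described above could look roughly like the following. The signature (a payload dict in, an error message or None out) and the `handle_request` wrapper are assumptions made for this sketch; consult the pw.io.http.rest_connector documentation for the callable it actually expects.

```python
from typing import Any, Optional


def validate_query(payload: dict[str, Any]) -> Optional[str]:
    """Hypothetical validator: return an error message, or None if the payload is fine."""
    query = payload.get("query")
    if not isinstance(query, str) or not query.strip():
        return "field 'query' must be a non-empty string"
    if len(query) > 1000:
        return "field 'query' must be at most 1000 characters"
    return None


def handle_request(payload: dict[str, Any]) -> tuple[int, str]:
    """Toy handler that maps a validation error to an HTTP 400 response."""
    error = validate_query(payload)
    if error is not None:
        return 400, error
    return 200, "accepted"
```

Rejecting malformed payloads at the HTTP layer keeps them from ever entering the dataflow as rows.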

Fixed

  • Addressed an issue in pw.io.xpacks.llm.VectorStoreServer where the computation of the last modification timestamp for an indexed document was incorrect.

Changed

  • Improved the behavior of pw.io.kafka.write. It now includes retries when sending data to the output topic encounters failures.

v0.8.0

01 Feb 14:51

Added

  • pw.io.http.rest_connector now supports multiple HTTP request types.
  • pw.io.http.PathwayWebserver now allows Cross-Origin Resource Sharing (CORS) to be enabled on newly added endpoints.
  • Wrappers for LiteLLM and HuggingFace chat services and SentenceTransformers embedding service are now added to Pathway xpack for LLMs.

Changed

  • pw.run now includes an additional parameter runtime_typechecking that enables strict type checking at runtime.
  • Embedders in pathway.xpacks.llm.embedders now correctly process empty strings as queries.
  • BREAKING: pw.run and pw.run_all now only accept keyword arguments.
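The practical effect of the keyword-only change can be shown with a stand-in function. The `run` below is hypothetical and only mimics a keyword-only signature (the bare `*` in the parameter list); it is not pw.run, and its parameter defaults are made up.

```python
def run(*, debug: bool = False, runtime_typechecking: bool = False) -> dict:
    """Hypothetical stand-in with a keyword-only signature; not pw.run itself."""
    return {"debug": debug, "runtime_typechecking": runtime_typechecking}


settings = run(runtime_typechecking=True)  # keyword call works

try:
    run(True)  # positional call is rejected
    positional_error = None
except TypeError as exc:
    positional_error = exc
```

Code that previously passed arguments positionally needs to name them explicitly after upgrading.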

Fixed

  • pw.Duration can now be returned from User-Defined Functions (UDFs) or used as a constant value without resulting in errors.
  • pw.io.debezium.read now correctly handles tables that do not have a primary key.