Add documentation for using DataConnectionService [SUP-523] #1361

Merged
1 change: 1 addition & 0 deletions docs/antora.yml
@@ -45,5 +45,6 @@ asciidoc:
hazelcast-cloud: Cloud
ucn: User Code Namespaces
ucd: User Code Deployment
minimum-java-version: 17
nav:
- modules/ROOT/nav.adoc
@@ -0,0 +1,74 @@
= Using Data Connections in custom components
:description: Using the Data Connection Service gives access to the configured xref:data-connections-configuration.adoc[data connections] in custom components.

{description}

== Typical Usage

The typical steps to use a data connection are as follows:

1. Obtain the data connection from the data connection service.
2. Retrieve the underlying resource from the `DataConnection` instance. This step varies based on the specific implementation of `DataConnection` (e.g., `JdbcDataConnection` provides `getConnection()` which returns a `java.sql.Connection`; `HazelcastDataConnection` provides `getClient()` which returns a `HazelcastInstance`).
3. Use the resource to perform the required operations.
4. Dispose of the resource (e.g., by calling `Connection#close` or `HazelcastInstance#shutdown`).
5. Release the `DataConnection` instance (by calling `DataConnection#release()`).

Steps 2, 3, and 4 should be completed as quickly as possible to maximize the efficiency of connection pooling.

[source,java]
----
JdbcDataConnection jdbcDataConnection = instance.getDataConnectionService()
        .getAndRetainDataConnection("my_data_connection", JdbcDataConnection.class); <1>

try (Connection connection = jdbcDataConnection.getConnection()) { <2>
    // ... work with connection <3>

    // try-with-resources statement closes the connection <4>
} catch (SQLException e) {
    throw new RuntimeException("Failed to use the data connection", e);
} finally {
    // release in a finally block so the instance is released even if the work fails
    jdbcDataConnection.release(); <5>
}
----
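
The order of steps 4 and 5 can also be expressed with a single try-with-resources statement, because Java closes resources in the reverse order of their declaration. The following is a minimal, library-free sketch of that ordering only. `RetainedConnection` is a hypothetical stand-in for a retained `DataConnection`, and the lambda resources merely record events; nothing here is part of the Hazelcast API.

```java
import java.util.ArrayList;
import java.util.List;

class ReleaseOrderSketch {

    /** Hypothetical stand-in for a retained DataConnection; only release() mirrors the real API. */
    interface RetainedConnection {
        void release();
    }

    static List<String> run() {
        List<String> events = new ArrayList<>();
        RetainedConnection dataConnection = () -> events.add("release data connection");

        // Resources are closed in reverse declaration order:
        // the underlying resource first, then the data connection is released.
        try (AutoCloseable releaser = dataConnection::release;
             AutoCloseable resource = () -> events.add("close resource")) {
            events.add("use resource");
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return events;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

With a real `JdbcDataConnection`, the same shape would declare `AutoCloseable releaser = jdbcDataConnection::release` before the `Connection`, which keeps the release tied to the scope in which the resource is used.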

== Retrieve Data Connection Service

Before working with data connections, you need an instance of the `DataConnectionService`. To obtain one, call
https://docs.hazelcast.org/docs/{full-version}/javadoc/com/hazelcast/core/HazelcastInstance.html#getDataConnectionService()[`HazelcastInstance#getDataConnectionService()`].

You can implement `HazelcastInstanceAware` in listeners, entry processors, tasks, and similar components to get access
to the `HazelcastInstance`.

In the Pipeline API, you can use
https://docs.hazelcast.org/docs/{full-version}/javadoc/com/hazelcast/jet/core/ProcessorMetaSupplier.Context.html#dataConnectionService()[`ProcessorMetaSupplier.Context#dataConnectionService()`].

NOTE: The Data Connection Service is only available on the member side. Calling `getDataConnectionService()` on a client results in an `UnsupportedOperationException`.
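
As an illustration of the `HazelcastInstanceAware` approach, the following sketch shows a task that receives the `HazelcastInstance` on the member where it runs and uses it to reach the Data Connection Service. The class name `ConnectionCheckTask` is hypothetical, the data connection name `my_data_connection` is assumed to be configured already, and the import path of `JdbcDataConnection` is an assumption that may differ between versions.

```java
import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.core.HazelcastInstanceAware;
import com.hazelcast.dataconnection.impl.JdbcDataConnection; // assumed package

import java.io.Serializable;
import java.sql.Connection;
import java.sql.SQLException;
import java.util.concurrent.Callable;

class ConnectionCheckTask implements Callable<Boolean>, Serializable, HazelcastInstanceAware {

    // Set by Hazelcast on the executing member; transient so it is not serialized with the task.
    private transient HazelcastInstance instance;

    @Override
    public void setHazelcastInstance(HazelcastInstance instance) {
        this.instance = instance;
    }

    @Override
    public Boolean call() {
        // Retain the data connection just before use ...
        JdbcDataConnection dataConnection = instance.getDataConnectionService()
                .getAndRetainDataConnection("my_data_connection", JdbcDataConnection.class);
        try (Connection connection = dataConnection.getConnection()) {
            // java.sql.Connection#isValid takes a timeout in seconds
            return connection.isValid(5);
        } catch (SQLException e) {
            throw new IllegalStateException("Connection check failed", e);
        } finally {
            // ... and release it when the work is done, even on failure.
            dataConnection.release();
        }
    }
}
```

A task like this would typically be submitted through an executor service obtained from a member, so that `call()` runs on the member side where the Data Connection Service is available.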

== Retrieve Configured DataConnection

To get an instance of a previously configured data connection, use https://docs.hazelcast.org/docs/{full-version}/javadoc/com/hazelcast/dataconnection/DataConnectionService.html#getAndRetainDataConnection(java.lang.String,java.lang.Class)[`DataConnectionService#getAndRetainDataConnection(String, Class)`]. For details on how to configure a data connection, see
the xref:data-connections-configuration.adoc[Configuring Data Connections] page.

== Data Connection Scope

The data connection configuration is applied per member. For example, when a data connection is created
with a maximum pool size of 10 and the cluster has 3 members, up to 30 connections can be
created across the cluster.
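
The arithmetic behind this example can be sketched as a small helper, which may be handy when sizing the connection limit on the database side. The class and method names are hypothetical and the calculation assumes a single shared pool per member.

```java
class PoolSizeEstimate {

    /** Upper bound on connections opened across the cluster for one data connection. */
    static int maxClusterConnections(int maxPoolSizePerMember, int memberCount) {
        // multiplyExact fails loudly on overflow instead of silently wrapping around
        return Math.multiplyExact(maxPoolSizePerMember, memberCount);
    }

    public static void main(String[] args) {
        // The example from the text: pool size 10 on each of 3 members
        System.out.println(maxClusterConnections(10, 3));
    }
}
```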

== Data Connection Sharing

A data connection is shared by default. This means that when the data connection is requested in multiple places, the same
underlying resource (e.g., a JDBC pool or a remote client) is used.
If you want to share the data connection configuration but use a different instance of the underlying resource,
call `DataConnectionConfig#setShared` with `false`.
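
A non-shared data connection could be configured programmatically along these lines. This is a sketch only: the connection name `my_isolated_connection` and the JDBC URL are placeholders, and the class name is hypothetical.

```java
import com.hazelcast.config.Config;
import com.hazelcast.config.DataConnectionConfig;

class NonSharedDataConnectionConfig {

    static DataConnectionConfig build() {
        DataConnectionConfig dcc = new DataConnectionConfig("my_isolated_connection"); // placeholder name
        dcc.setType("JDBC");
        dcc.setProperty("jdbcUrl", "jdbc:postgresql://localhost:5432/postgres"); // placeholder URL
        // Each retrieval gets its own underlying resource instead of the shared one
        dcc.setShared(false);
        return dcc;
    }

    public static void main(String[] args) {
        Config config = new Config();
        config.addDataConnectionConfig(build());
        System.out.println("shared = " + build().isShared());
    }
}
```

Because a non-shared data connection does not reuse a common pool, disposing of the underlying resource promptly matters even more than in the shared case.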

== Configuration Considerations

If the data connection is defined in the Hazelcast configuration, it remains immutable for the entire lifespan of the Hazelcast member. In this case, whether you retrieve the `DataConnection` instance once or each time before accessing the underlying resource, the result will be the same.

However, if the data connection is created dynamically via SQL, it can be replaced using `CREATE OR REPLACE DATA CONNECTION`
(see xref:sql.adoc).
In such cases, a `DataConnection` instance that you have already retrieved stays valid until you release it, so you can keep retrieving the underlying resource from it as needed. Retrieving the data connection again each time, on the other hand, can be useful for adapting to changes in the data connection configuration.

For example, if you are running a batch job and want to use the same data connection throughout, request the connection at the start of the job. For a streaming job that may need updated configurations, retrieve both the data connection and the underlying resource just before use (e.g., when processing each item in the pipeline).
@@ -0,0 +1,278 @@
= Map Loader using Data Connection

:description: In this tutorial you build a custom map loader that uses a configured data connection to load data not present in an IMap.

{description}

NOTE: This tutorial builds a custom implementation of `MapLoader`. For the most common use cases, we also provide an out-of-the-box implementation, xref:mapstore:configuring-a-generic-maploader.adoc[GenericMapLoader].

== Before you begin

To complete this tutorial, you need the following:

[cols="1a,1a"]
|===
|Prerequisites|Useful resources

|Java {minimum-java-version} or newer
|
|Maven or Gradle
| https://maven.apache.org/install.html or https://gradle.org/install/
|Docker
|https://docs.docker.com/get-started/[Get Started on docker.com]

|===

== Step 1. Create and Populate the Database

This tutorial uses Docker to run the Postgres database.

Run the following command to start Postgres:

[source, bash]
----
docker run --name postgres --rm -e POSTGRES_PASSWORD=postgres -p 5432:5432 postgres
----

Start the `psql` client:

[source, bash]
----
docker exec -it postgres psql -U postgres
----

Create a table `my_table` and populate it with data:

[source,sql]
----
CREATE TABLE my_table(id INTEGER PRIMARY KEY, value VARCHAR(128));

INSERT INTO my_table VALUES (0, 'zero');
INSERT INTO my_table VALUES (1, 'one');
INSERT INTO my_table VALUES (2, 'two');
INSERT INTO my_table VALUES (3, 'three');
INSERT INTO my_table VALUES (4, 'four');
INSERT INTO my_table VALUES (5, 'five');
INSERT INTO my_table VALUES (6, 'six');
INSERT INTO my_table VALUES (7, 'seven');
INSERT INTO my_table VALUES (8, 'eight');
INSERT INTO my_table VALUES (9, 'nine');
----

== Step 2. Create a New Java Project

Create a blank Java project named `maploader-data-connection-example` and copy the following Maven `pom.xml` file into it:

[source,xml]
----
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>org.example</groupId>
    <artifactId>maploader-data-connection-example</artifactId>
    <version>1.0-SNAPSHOT</version>

    <name>maploader-data-connection-example</name>

    <properties>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
        <maven.compiler.release>17</maven.compiler.release>
    </properties>

    <dependencies>
        <dependency>
            <groupId>com.hazelcast</groupId>
            <artifactId>hazelcast</artifactId>
            <version>6.0.0-SNAPSHOT</version>
        </dependency>
        <dependency>
            <groupId>org.apache.logging.log4j</groupId>
            <artifactId>log4j-core</artifactId>
            <version>2.24.1</version>
        </dependency>
        <dependency>
            <groupId>org.apache.logging.log4j</groupId>
            <artifactId>log4j-slf4j2-impl</artifactId>
            <version>2.24.1</version>
        </dependency>
        <dependency>
            <groupId>org.postgresql</groupId>
            <artifactId>postgresql</artifactId>
            <version>42.7.4</version>
        </dependency>
    </dependencies>
</project>
----

== Step 3. MapLoader

The following map loader implements the `com.hazelcast.map.MapLoader` and `com.hazelcast.map.MapLoaderLifecycleSupport`
interfaces.

[source,java]
----
public class SimpleMapLoader implements MapLoader<Integer, String>, MapLoaderLifecycleSupport {

    private JdbcDataConnection jdbcDataConnection;

    // ...
}
----

To implement the `MapLoaderLifecycleSupport` interface we need the following methods:

[source,java]
----
// ...

@Override
public void init(HazelcastInstance instance, Properties properties, String mapName) {
    // Retain the data connection for the lifetime of the map loader
    jdbcDataConnection = instance.getDataConnectionService()
            .getAndRetainDataConnection("my_data_connection", JdbcDataConnection.class);
}

@Override
public void destroy() {
    // Release the data connection retained in init()
    jdbcDataConnection.release();
}

// ...
----

To implement the `MapLoader` interface we need the following methods:

[source,java]
----
@Override
public String load(Integer key) {
    try (Connection connection = jdbcDataConnection.getConnection();
         PreparedStatement statement = connection.prepareStatement("SELECT value FROM my_table WHERE id = ?")) {

        statement.setInt(1, key);
        try (ResultSet resultSet = statement.executeQuery()) {
            // A null return value means the key is not present in the database
            return resultSet.next() ? resultSet.getString("value") : null;
        }
    } catch (SQLException e) {
        throw new RuntimeException("Failed to load value for key=" + key, e);
    }
}

@Override
public Map<Integer, String> loadAll(Collection<Integer> keys) {
    Map<Integer, String> resultMap = new HashMap<>();
    if (keys.isEmpty()) {
        // With no keys, the query built below would be malformed, so return early
        return resultMap;
    }

    // Construct query for batch retrieval with one placeholder per key
    StringBuilder queryBuilder = new StringBuilder("SELECT id, value FROM my_table WHERE id IN (");
    keys.forEach(key -> queryBuilder.append("?,"));
    queryBuilder.setLength(queryBuilder.length() - 1); // Remove last comma
    queryBuilder.append(")");

    try (Connection connection = jdbcDataConnection.getConnection();
         PreparedStatement statement = connection.prepareStatement(queryBuilder.toString())) {

        int index = 1;
        for (Integer key : keys) {
            statement.setInt(index++, key);
        }

        try (ResultSet resultSet = statement.executeQuery()) {
            while (resultSet.next()) {
                resultMap.put(resultSet.getInt("id"), resultSet.getString("value"));
            }
        }
        return resultMap;
    } catch (SQLException e) {
        throw new RuntimeException("Failed to load values", e);
    }
}

@Override
public Iterable<Integer> loadAllKeys() {
    List<Integer> keys = new ArrayList<>();
    try (Connection connection = jdbcDataConnection.getConnection();
         PreparedStatement statement = connection.prepareStatement("SELECT id FROM my_table");
         ResultSet resultSet = statement.executeQuery()) {

        while (resultSet.next()) {
            keys.add(resultSet.getInt("id"));
        }
        return keys;
    } catch (SQLException e) {
        throw new RuntimeException("Failed to load all keys", e);
    }
}
----
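
The placeholder list that `loadAll` builds for the `IN (...)` clause can also be produced without trimming a trailing comma. The following standalone sketch shows one alternative using only the JDK; the class and method names are hypothetical and are not part of the tutorial's map loader.

```java
import java.util.Collections;

class InClauseBuilder {

    /** Builds the batch query with one '?' placeholder per key. */
    static String selectByIds(int keyCount) {
        if (keyCount <= 0) {
            // An IN clause needs at least one placeholder
            throw new IllegalArgumentException("keyCount must be positive, got " + keyCount);
        }
        String placeholders = String.join(",", Collections.nCopies(keyCount, "?"));
        return "SELECT id, value FROM my_table WHERE id IN (" + placeholders + ")";
    }

    public static void main(String[] args) {
        System.out.println(selectByIds(3));
    }
}
```

The keys themselves are still bound with `PreparedStatement#setInt`, so the query text never contains user-supplied values.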

== Step 4. MapLoader Example App

Configure the data connection:

[source,java]
----
public class MapLoaderExampleApp {
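    public static void main(String[] args) {
        Config config = new Config();

        DataConnectionConfig dcc = new DataConnectionConfig("my_data_connection");
        dcc.setType("JDBC");
        // 172.17.0.2 is typically the container's address on Docker's default bridge network.
        // Because port 5432 is published, jdbc:postgresql://localhost:5432/postgres also works.
        dcc.setProperty("jdbcUrl", "jdbc:postgresql://172.17.0.2/postgres");
        dcc.setProperty("user", "postgres");
        dcc.setProperty("password", "postgres");
        config.addDataConnectionConfig(dcc);
    }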
}
----

Configure an IMap named `my_map` with the map loader:

[source,java]
----
public class MapLoaderExampleApp {
    public static void main(String[] args) {
        // ...

        MapStoreConfig mapStoreConfig = new MapStoreConfig();
        mapStoreConfig.setClassName(SimpleMapLoader.class.getName());

        MapConfig mapConfig = new MapConfig("my_map");
        mapConfig.setMapStoreConfig(mapStoreConfig);
        config.addMapConfig(mapConfig);
    }
}
----

Create a `HazelcastInstance` with the `Config`, get the IMap, and read some data:

[source,java]
----
public class MapLoaderExampleApp {
    public static void main(String[] args) {
        // ...

        HazelcastInstance hz = Hazelcast.newHazelcastInstance(config);
        IMap<Integer, String> map = hz.getMap("my_map");

        System.out.println("1 maps to " + map.get(1));
        // Key 42 is not in my_table, so the map loader returns null
        System.out.println("42 maps to " + map.get(42));
    }
}
----

When you run this class you should see the following output:

[source,text]
----
1 maps to one
42 maps to null
----

== Next steps

Read through the xref:configuration:dynamic-config.adoc[Dynamic Configuration] section to find out how to add the
`DataConnection` config and new `IMap` config with `MapStore` dynamically.