This attempts to implement some of the SQL datalink functionality on PostgreSQL. It is very much a prototype and meant for playing around to see if this can be made useful. It implements a number of SQL/MED specified datalink behaviours.
DATALINK is special SQL type intended to store references to external files in the database. Only references to external files are stored in the database, not the content of the files themselves. Files are addressed with Uniform Resource Locators (URLs).
For detailed documentation see Datalink manual.
The standard states: "The purpose of datalinks is to provide a mechanism to synchronize the integrity control, recovery, and access control of the files and the SQL-data associated with them. "
Currently, it implements the following:
- SQL/MED DATALINK type, currently defined as a base type (a variant of jsonb)
- SQL/MED DATALINK constructors DLVALUE, DLPREVIOUSCOPY and DLNEWCOPY
- SQL/MED functions DLURLCOMPLETE, DLURLCOMPLETEONLY
- SQL/MED functions DLURLPATH, DLURLPATHONLY
- SQL/MED functions DLURLSCHEME, DLURLSERVER
- DLREPLACECONTENT function
- DLLINKTYPE function
- DLCOMMENT function
- Setting of link control options (LCOs) with
UPDATE DATALINK.COLUMNS
or by using type modifier (i.e.datalink(52)
) - Event and other triggers to make all of this 'just work'
- Token generator (uses uuid-ossp)
- LCO: NO LINK CONTROL - only check for valid URLs and normalize them
- LCO: FILE LINK CONTROL INTEGRITY SELECTIVE - check if file exists with CURL HEAD, this also works for web
- LCO: FILE LINK CONTROL INTEGRITY ALL - keep track of linked files in
datalink.dl_linked_files
table - Simple file manager to provide other LCOs, see below
With datalinker:
- LCO: READ ACCESS DB - make file owned by database (chown, chmod)
- LCO: WRITE ACCESS BLOCKED - make file immutable (chattr +i on extfs), forbid datalink column updates
- LCO: WRITE ACCESS ADMIN - make file immutable, allow datalink column updates
- LCO: WRITE ACCESS TOKEN - make file immutable, allow column updates only with matching write token
- LCO: RECOVERY YES - backup and restore of linked file contents, point in time recovery
- LCO: ON UNLINK RESTORE - restore file permissions upon unlink
- LCO: ON UNLINK DELETE - delete file when no longer referenced
Missing features:
- SQL/MED functions DLURLCOMPLETEWRITE, DLURLPATHWRITE
Extra features not in SQL standard:
- URI handling functions
uri_get()
anduri_set()
, uses pguri - PlPerlu interface
curl_get()
to curl via WWW::Curl - Directory and role based permission management for file system access control
- Directory to URL mapping
- Foreign server support for file:// URLs (for files on other postgres_fdw foreign servers)
- Compatibility functions from other databases
This extension uses a number of advanced Postgres features for implementation, including custom types, transactions, jsonb, triggers, updatable views, uuids, listen/notify, file_fdw, plperlu, dblink, advisory locks... Currently it is fully implemented in high-level languages (no C), mostly in sql, plpgsql and plperlu. However it requires pguri extension for URL processing and curl for integrity checking. All these together provide a powerful file and web framework within SQL environment.
You will need to have
WWW::Curl
Perl package installed, as it is used by the extension.
On Debian, you will need to install libwww-curl-perl
and also libdbd-pg-perl
and libconfig-tiny-perl
packages.
apt install libwww-curl-perl libdbd-pg-perl libconfig-tiny-perl liburiparser-dev
Also required is pguri extension, which must be installed separately. It is configured as a github submodule, so you can get it by:
git submodule init
git submodule update
To build and install this module:
make
make install
make install installcheck
or selecting a specific PostgreSQL installation:
make PG_CONFIG=/some/where/bin/pg_config
make PG_CONFIG=/some/where/bin/pg_config install
And finally inside the database:
CREATE EXTENSION datalink;
This requires superuser privileges.
This extension lives mostly in the datalink
schema.
SQL/MED standard compliant functions are installed in pg_catalog
schema,
so they are accessible regardless of the search_path.
Event trigger datalink_event_trigger
is installed.
It takes care of adding and removing datalink triggers on tables, which contain datalink columns.
Datalink triggers take care of referencing and dereferencing datalinks as values are assigned to datalink columns.
A special type DATALINK is provided. It behaves like SQL/MED DATALINK type. When creating table columns of this type, datalink triggers are automatically installed on the table.
create table sample_datalinks (
id serial,
url text,
link datalink
);
insert into sample_datalinks (link)
values (dlvalue('http://www.debian.org'));
select dlurlcomplete(link)
from sample_datalinks;
SQL/MED syntax to set link control options for a column is not supported, but you can use normal SQL UPDATE on table DATALINK.COLUMNS to set them instead.
update datalink.columns
set integrity='ALL',
read_access='DB', write_access='BLOCKED',
recovery='YES', on_unlink='RESTORE'
where table_name='sample_datalinks' and column_name='link';
If the datalink column already contains data when changing options then there are some limitiations. It is best to set options in advance, before storing any data.
For further examples see Datalink manual.
Constructors for values of type datalink:
DLVALUE(url[,link_type][,comment]) → datalink
(for INSERT)DLNEWCOPY(datalink,tokenp) → datalink
(for UPDATE)DLPREVIOUSCOPY(datalink,tokenp) → datalink
(for UPDATE)DLREPLACECONTENT(datalink,datalink) → datalink
(for INSERT and UPDATE)
Functions for extracting information from datalink type:
DLURLCOMPLETE(datalink) → url
DLURLCOMPLETEONLY(datalink) → url
DLURLPATH(datalink) → path
DLURLPATHONLY(datalink) → path
DLURLSERVER(datalink) → text
DLURLSCHEME(datalink) → text
DLCOMMENT(datalink) → text
DLLINKTYPE(datalink) → text
- Datalink manual
- Slides on design of datalink for Postgres (old)
- Tests contain some examples
- https://wiki.postgresql.org/wiki/DATALINK
- SQL/MED standard
- darold/datalink - another implementation of datalink for Postgres by Gilles Darold, check out especially his presentation.
- IBM Redbook™ about Datalinks