readme: clarify and indent
steve-chavez committed Dec 15, 2023
1 parent 4b11efe commit 0ad4df6
Showing 1 changed file with 37 additions and 36 deletions.
73 changes: 37 additions & 36 deletions README.md
@@ -2,67 +2,68 @@

## Motivation

If you get data compressed as bzip2, whether through [HTTP](https://github.com/pramsey/pgsql-http) or from a file, it's convenient to decompress it directly in SQL.
`pg_bzip` does just that: it provides functions to decompress and compress data using bzip2.

## Functions

- `bzcat(data bytea) returns bytea`

This function mimics the [bzcat](https://linux.die.net/man/1/bzcat) command, which decompresses data using bzip2.

For this example, we'll use the native [pg_read_binary_file](https://pgpedia.info/p/pg_read_binary_file.html) to read from a file.

```sql
select convert_from(bzcat(pg_read_binary_file('/path/to/all_movies.csv.bz2')), 'utf8') as contents;

contents
--------------------------------------------------------------------------------------------------------------------------------------------
"id","name","parent_id","date" +
"2","Ariel","8384","1988-10-21" +
"3","Varjoja paratiisissa","8384","1986-10-17" +
"4","État de siège",\N,"1972-12-30" +
"5","Four Rooms",\N,"1995-12-22" +
"6","Judgment Night",\N,"1993-10-15" +
"8","Megacities - Life in Loops",\N,"2006-01-01" +
"9","Sonntag, im August",\N,"2004-09-22" +
"11","Star Wars: Episode IV – A New Hope","10","1977-05-25" +
"12","Finding Nemo","112246","2003-05-30" +
...
....
.....
```

- `bzip2(data bytea, compression_level int default 9) returns bytea`

This function is a simplified version of the [bzip2](https://linux.die.net/man/1/bzip2) command. It compresses data using bzip2.

For this example we'll use `fio_writefile` from [pgsql-fio](https://github.com/csimsek/pgsql-fio), which offers a convenient way to write a file from SQL.

```sql
select fio_writefile('/home/stevechavez/Projects/pg_bzip/my_text.bz2', bzip2(repeat('my secret text to be compressed', 1000)::bytea)) as writesize;

writesize
-----------
109
```
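
As a quick sanity check, the two functions can be combined into a round trip. This is only a sketch: the sample string and the column aliases are made up for illustration, and passing `1` as `compression_level` assumes the function accepts the usual bzip2 range of 1 to 9, which the signature above does not state.

```sql
-- Round trip: compress a value, decompress it, and compare with the input.
-- 'hello bzip2' is an arbitrary sample string.
select convert_from(bzcat(bzip2('hello bzip2'::bytea)), 'utf8') = 'hello bzip2' as round_trip_ok;

-- Compare compressed sizes at an explicit level (assumed valid: 1)
-- and at the default level (9, per the signature above).
select
  octet_length(bzip2(repeat('my secret text to be compressed', 1000)::bytea, 1)) as size_level_1,
  octet_length(bzip2(repeat('my secret text to be compressed', 1000)::bytea))    as size_default;
```

If the round trip works, `round_trip_ok` should be true. The sizes in the second query depend on the input, so no expected values are shown.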

## Installation

bzip2 is required. On Debian/Ubuntu, you can install it with:

```bash
sudo apt install libbz2-dev
```

Then, from the root of this repo, run:

```bash
make && make install
```

Now, in SQL, you can run:

```sql
CREATE EXTENSION bzip;
```
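
To confirm the extension is registered, one option is to query the standard `pg_extension` catalog. This is a small sketch, assuming the extension name is `bzip` as in the `CREATE EXTENSION` statement above:

```sql
-- Lists the extension and its installed version in the current database;
-- an empty result means CREATE EXTENSION has not been run here.
select extname, extversion
from pg_extension
where extname = 'bzip';
```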

@@ -72,7 +73,7 @@ CREATE EXTENSION bzip;

[Nix](https://nixos.org/download.html) is used to get an isolated and reproducible environment with multiple postgres versions.

```bash
# enter the Nix environment
$ nix-shell
```