Rework source field options for CSV and fixed files, see #116.

It's now possible to use a comma separator when using more than one
source field option at the same time, and for better readability the
option list is enclosed in square brackets.

Also, it's now possible to spell out the "from" and "for" keywords in
the source definitions, making the load file easier to read and maintain,
as in this full example:

          (
           a from  0 for 10,
           b from 10 for  8,
           c from 18 for  8,
           d from 26 for 17 [null if blanks, trim right whitespace]
          )
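
To make the meaning of the field definitions above concrete, here is a minimal Python sketch of how such a layout could be applied to one line of input. It is not pgloader code: the names `FIELDS` and `read_fixed_line` are made up for illustration, the order in which the two options are applied is an assumption, and Python slicing counts characters where pgloader documents byte positions.

```python
# Field layout from the example above: (name, start, length, options).
# FIELDS and read_fixed_line are hypothetical names, not part of pgloader.
FIELDS = [
    ("a", 0, 10, []),
    ("b", 10, 8, []),
    ("c", 18, 8, []),
    ("d", 26, 17, ["null if blanks", "trim right whitespace"]),
]


def read_fixed_line(line, fields=FIELDS):
    """Slice one fixed-width line into a dict of column values."""
    row = {}
    for name, start, length, options in fields:
        # "from START for LENGTH" maps to a plain slice of the line.
        value = line[start:start + length]
        if "null if blanks" in options and value.strip(" ") == "":
            # Assumption: a field made only of blanks (or empty) becomes NULL.
            value = None
        elif "trim right whitespace" in options:
            value = value.rstrip()
        row[name] = value
    return row


row = read_fixed_line("01234567892008052011431250firstline")
print(row)
```

With the first data line of the test file, this yields `a = "0123456789"`, `b = "20080520"`, `c = "11431250"` and `d = "firstline"`. A line that stops after column `c` gives `d = None` under the assumption stated above.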
Dimitri Fontaine 2014-10-01 18:32:40 +02:00
parent ea97fc4659
commit ac55d71401
4 changed files with 63 additions and 15 deletions


@ -1,7 +1,7 @@
.\" generated with Ronn/v0.7.3
.\" http://github.com/rtomayko/ronn/tree/0.7.3
.
.TH "PGLOADER" "1" "September 2014" "ff" ""
.TH "PGLOADER" "1" "October 2014" "ff" ""
.
.SH "NAME"
\fBpgloader\fR \- PostgreSQL data loader
@ -516,7 +516,7 @@ The optional \fIIN DIRECTORY\fR clause allows specifying which directory to walk
The \fIFROM\fR option also supports an optional comma separated list of \fIfield\fR names describing what is expected in the \fBCSV\fR data file, optionally introduced by the clause \fBHAVING FIELDS\fR\.
.
.IP
Each field name can be either only one name or a name following with specific reader options for that field\. Supported per\-field reader options are:
Each field name can be either only one name or a name following with specific reader options for that field, enclosed in square brackets and comma\-separated\. Supported per\-field reader options are:
.
.IP "\(bu" 4
\fIterminated by\fR
@ -639,7 +639,13 @@ This command instructs pgloader to load data from a text file containing columns
.nf
LOAD FIXED
FROM inline (a 0 10, b 10 8, c 18 8, d 26 17)
FROM inline
(
a from 0 for 10,
b from 10 for 8,
c from 18 for 8,
d from 26 for 17 [null if blanks, trim right whitespace]
)
INTO postgresql:///pgloader?fixed
(
a, b,
@ -666,6 +672,8 @@ BEFORE LOAD DO
01234567892008052011431250firstline
01234562008052115182300left blank\-padded
12345678902008052208231560another line
2345609872014092914371500
2345678902014092914371520
.
.fi
.
@ -715,6 +723,11 @@ Position in the line where to start reading that field\'s value\. Can be entered
.IP
How many bytes to read from the \fIstart\fR position to read that field\'s value\. Same format as \fIstart\fR\.
.
.IP "" 0
.
.IP
Those optional parameters can be enclosed in square brackets and comma\-separated:
.
.IP "\(bu" 4
\fIterminated by\fR
.


@ -465,8 +465,8 @@ The `csv` format command accepts the following clauses and options:
optionally introduced by the clause `HAVING FIELDS`.
Each field name can be either only one name or a name following with
specific reader options for that field. Supported per-field reader
options are:
specific reader options for that field, enclosed in square brackets and
comma-separated. Supported per-field reader options are:
- *terminated by*
@ -576,7 +576,13 @@ This command instructs pgloader to load data from a text file containing
columns arranged in a *fixed size* manner. Here's an example:
LOAD FIXED
FROM inline (a 0 10, b 10 8, c 18 8, d 26 17)
FROM inline
(
a from 0 for 10,
b from 10 for 8,
c from 18 for 8,
d from 26 for 17 [null if blanks, trim right whitespace]
)
INTO postgresql:///pgloader?fixed
(
a, b,
@ -603,6 +609,8 @@ columns arranged in a *fixed size* manner. Here's an example:
01234567892008052011431250firstline
01234562008052115182300left blank-padded
12345678902008052208231560another line
2345609872014092914371500
2345678902014092914371520
The `fixed` format command accepts the following clauses and options:
@ -642,6 +650,9 @@ The `fixed` format command accepts the following clauses and options:
How many bytes to read from the *start* position to read that
field's value. Same format as *start*.
Those optional parameters can be enclosed in square brackets and
comma-separated:
- *terminated by*
See the description of *field terminated by* below.


@ -120,6 +120,8 @@
(def-keyword-rule "left")
(def-keyword-rule "right")
(def-keyword-rule "whitespace")
(def-keyword-rule "from")
(def-keyword-rule "for")
(def-keyword-rule "skip")
(def-keyword-rule "header")
(def-keyword-rule "null")
@ -1721,6 +1723,28 @@ load database
option-trim-left-whitespace
option-trim-right-whitespace))
(defrule another-csv-field-option (and comma csv-field-option)
(:lambda (field-option)
(destructuring-bind (comma option) field-option
(declare (ignore comma))
option)))
(defrule open-square-bracket (and ignore-whitespace #\[ ignore-whitespace)
(:constant :open-square-bracket))
(defrule close-square-bracket (and ignore-whitespace #\] ignore-whitespace)
(:constant :close-square-bracket))
(defrule csv-field-option-list (and open-square-bracket
csv-field-option
(* another-csv-field-option)
close-square-bracket)
(:lambda (option)
(destructuring-bind (open opt1 opts close) option
(declare (ignore open close))
(alexandria:alist-plist `(,opt1 ,@opts)))))
(defrule csv-field-options (? (or csv-field-option csv-field-option-list)))
(defrule csv-field-options (* csv-field-option)
(:lambda (options)
(alexandria:alist-plist options)))
@ -2038,11 +2062,11 @@ load database
(defrule number (or hex-number dec-number))
(defrule field-start-position (and ignore-whitespace number)
(:destructure (ws pos) (declare (ignore ws)) pos))
(defrule field-start-position (and (? kw-from) ignore-whitespace number)
(:destructure (from ws pos) (declare (ignore from ws)) pos))
(defrule fixed-field-length (and ignore-whitespace number)
(:destructure (ws len) (declare (ignore ws)) len))
(defrule fixed-field-length (and (? kw-for) ignore-whitespace number)
(:destructure (for ws len) (declare (ignore for ws)) len))
(defrule fixed-source-field (and csv-field-name
field-start-position fixed-field-length
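
The grammar change above makes the `from` and `for` keywords optional and accepts a bracketed, comma-separated option list. The following Python sketch mimics that surface syntax with a regular expression, only to illustrate which inputs are accepted. It is an approximation and not a port of the esrap rules: `FIELD_RE` and `parse_fixed_field` are hypothetical names, only decimal numbers are handled (the real `number` rule also accepts hex), and a single option written without brackets is not covered.

```python
import re

# Hypothetical approximation of the fixed-source-field syntax:
#   name [from] start [for] length [ "[" option {"," option} "]" ]
FIELD_RE = re.compile(
    r"""^\s*(?P<name>\w+)\s+
        (?:from\s+)?(?P<start>\d+)\s+
        (?:for\s+)?(?P<length>\d+)
        (?:\s*\[(?P<options>[^\]]*)\])?\s*$""",
    re.VERBOSE | re.IGNORECASE,
)


def parse_fixed_field(spec):
    """Parse one field definition into (name, start, length, options)."""
    match = FIELD_RE.match(spec)
    if match is None:
        raise ValueError(f"cannot parse field definition: {spec!r}")
    raw_options = match.group("options")
    # Options are comma-separated inside the square brackets.
    options = [o.strip() for o in raw_options.split(",")] if raw_options else []
    return (
        match.group("name"),
        int(match.group("start")),
        int(match.group("length")),
        options,
    )


print(parse_fixed_field("d from 26 for 17 [null if blanks, trim right whitespace]"))
print(parse_fixed_field("a 0 10"))
```

Both the old positional form (`a 0 10`) and the new keyword form (`a from 0 for 10`) parse to the same tuple, which matches the intent of wrapping the keywords in `(? ...)` in the rules above.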


@ -13,11 +13,11 @@
LOAD FIXED
FROM inline
( -- col start length opts
a 0 10,
b 10 8,
c 18 8,
d 26 17 null if blanks trim right whitespace
(
a from 0 for 10,
b from 10 for 8,
c from 18 for 8,
d from 26 for 17 [null if blanks, trim right whitespace]
)
INTO postgresql:///pgloader?fixed
(