From 7344e1d81eec74563e3b4eabe2da766f5bcddb56 Mon Sep 17 00:00:00 2001
From: Dimitri Fontaine
Date: Wed, 18 May 2016 11:07:28 +0200
Subject: [PATCH] Improve docs for FILENAMES MATCHING support.

This format of source file specifications is available for CSV, COPY and
FIXED formats but was only documented for the CSV one. The paragraph is
copy/pasted around in the hope to produce per-format man pages and web
documentation in a fully automated way sometime.

Fix #397.
---
 pgloader.1    | 65 ++++++++++++++++++++++++++++++++++++++++++++++++++-
 pgloader.1.md | 47 ++++++++++++++++++++++++++++++++++++-
 2 files changed, 110 insertions(+), 2 deletions(-)

diff --git a/pgloader.1 b/pgloader.1
index 6c24293..5df26bd 100644
--- a/pgloader.1
+++ b/pgloader.1
@@ -1,7 +1,7 @@
 .\" generated with Ronn/v0.7.3
 .\" http://github.com/rtomayko/ronn/tree/0.7.3
 .
-.TH "PGLOADER" "1" "April 2016" "ff" ""
+.TH "PGLOADER" "1" "May 2016" "ff" ""
 .
 .SH "NAME"
 \fBpgloader\fR \- PostgreSQL data loader
@@ -1206,6 +1206,30 @@ The data is found after the end of the parsed commands\. Any number of empty lin
 .IP
 Reads the data from the standard input stream\.
 .
+.IP "\(bu" 4
+\fIFILENAMES MATCHING\fR
+.
+.IP
+The whole \fImatching\fR clause must follow the following rule:
+.
+.IP "" 4
+.
+.nf
+
+[ ALL FILENAMES | [ FIRST ] FILENAME ]
+MATCHING regexp
+[ IN DIRECTORY \'\.\.\.\' ]
+.
+.fi
+.
+.IP "" 0
+.
+.IP
+The \fImatching\fR clause applies given \fIregular expression\fR (see above for exact syntax, several options can be used here) to filenames\. It\'s then possible to load data from only the first match of all of them\.
+.
+.IP
+The optional \fIIN DIRECTORY\fR clause allows specifying which directory to walk for finding the data files, and can be either relative to where the command file is read from, or absolute\. The given directory must exists\.
+.
 .IP "" 0
 .
 .IP
@@ -1398,6 +1422,45 @@ The \fBCOPY\fR format command accepts the following clauses and options:
 .
 .IP
 Filename where to load the data from\. This support local files, HTTP URLs and zip files containing a single dbf file of the same name\. Fetch such a zip file from an HTTP address is of course supported\.
+.
+.IP "\(bu" 4
+\fIinline\fR
+.
+.IP
+The data is found after the end of the parsed commands\. Any number of empty lines between the end of the commands and the beginning of the data is accepted\.
+.
+.IP "\(bu" 4
+\fIstdin\fR
+.
+.IP
+Reads the data from the standard input stream\.
+.
+.IP "\(bu" 4
+\fIFILENAMES MATCHING\fR
+.
+.IP
+The whole \fImatching\fR clause must follow the following rule:
+.
+.IP "" 4
+.
+.nf
+
+[ ALL FILENAMES | [ FIRST ] FILENAME ]
+MATCHING regexp
+[ IN DIRECTORY \'\.\.\.\' ]
+.
+.fi
+.
+.IP "" 0
+.
+.IP
+The \fImatching\fR clause applies given \fIregular expression\fR (see above for exact syntax, several options can be used here) to filenames\. It\'s then possible to load data from only the first match of all of them\.
+.
+.IP
+The optional \fIIN DIRECTORY\fR clause allows specifying which directory to walk for finding the data files, and can be either relative to where the command file is read from, or absolute\. The given directory must exists\.
+.
+.IP "" 0
+
 .
 .IP "\(bu" 4
 \fIWITH\fR
diff --git a/pgloader.1.md b/pgloader.1.md
index 4c10bf6..721d0c5 100644
--- a/pgloader.1.md
+++ b/pgloader.1.md
@@ -1070,7 +1070,25 @@ The `fixed` format command accepts the following clauses and options:

 Reads the data from the standard input stream.

- The *FROM* option also supports an optional comma separated list of
+ - *FILENAMES MATCHING*
+
+ The whole *matching* clause must follow the following rule:
+
+ [ ALL FILENAMES | [ FIRST ] FILENAME ]
+ MATCHING regexp
+ [ IN DIRECTORY '...' ]
+
+ The *matching* clause applies given *regular expression* (see above
+ for exact syntax, several options can be used here) to filenames.
+ It's then possible to load data from only the first match of all of
+ them.
+
+ The optional *IN DIRECTORY* clause allows specifying which directory
+ to walk for finding the data files, and can be either relative to
+ where the command file is read from, or absolute. The given
+ directory must exists.
+
+ The *FROM* option also supports an optional comma separated list of
 *field* names describing what is expected in the `FIXED` data file.

 Each field name is composed of the field name followed with specific
@@ -1207,6 +1225,33 @@ The `COPY` format command accepts the following clauses and options:
 URLs and zip files containing a single dbf file of the same name.
 Fetch such a zip file from an HTTP address is of course supported.

+ - *inline*
+
+ The data is found after the end of the parsed commands. Any number
+ of empty lines between the end of the commands and the beginning of
+ the data is accepted.
+
+ - *stdin*
+
+ Reads the data from the standard input stream.
+
+ - *FILENAMES MATCHING*
+
+ The whole *matching* clause must follow the following rule:
+
+ [ ALL FILENAMES | [ FIRST ] FILENAME ]
+ MATCHING regexp
+ [ IN DIRECTORY '...' ]
+
+ The *matching* clause applies given *regular expression* (see above
+ for exact syntax, several options can be used here) to filenames.
+ It's then possible to load data from only the first match of all of
+ them.
+
+ The optional *IN DIRECTORY* clause allows specifying which directory
+ to walk for finding the data files, and can be either relative to
+ where the command file is read from, or absolute. The given
+ directory must exists.

 - *WITH*