It might be important to be able to use the exact same pgloader command
file while adapting its source and target to the environment where
the command is to be run (production, development, staging, etc.).
Introduce the new sub-clause GETENV 'variable-name' to that effect.
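Here's a minimal sketch of how that could look, assuming GETENV is accepted
anywhere a connection string is expected; the environment variable names
below are made up, not mandated by pgloader:

    -- both connection strings are read from the environment at run time;
    -- the variable names are illustrative only
    LOAD DATABASE
         FROM GETENV 'MYSQL_SOURCE_URI'
         INTO GETENV 'PGSQL_TARGET_URI';

Exporting different values for those variables in production, staging and
development is then enough to reuse the same command file everywhere.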
The regression test facility that we have now isn't nearly sophisticated
enough to support this, so the feature isn't yet covered.
The previous patch didn't take into account the need to retain the case
of PostgreSQL column names when double-quotes are used in the load
command; the quoted name is now properly forwarded down to the COPY command.
With this patch it's now possible to "quote" column names and to use
_ (underscore) as the leading character in a field or column name.
Quoting the column names is of course still optional, but handling the
PostgreSQL quoting rules properly can't hurt here.
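As an example, a command along these lines should now parse and keep the
quoted spellings intact all the way down to the COPY column list; the file,
table and column names are only illustrative:

    -- quoted field and column names keep their case, and _internal_code
    -- starts with an underscore; both are accepted now (names are made up)
    LOAD CSV
         FROM 'data.csv' ("Id", "FirstName", _internal_code)
         INTO postgresql://dbuser@localhost/mydb?people
              ("Id", "FirstName", _internal_code)
         WITH fields terminated by ',';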
Generalize what the previous patch did only for the MySQL case, this
time changing the internal parser API expectations so that every clause
"flags" its return value with a per-clause keyword; the collected results
can then be destructured as an ordinary lambda list with keywords.
Only the MySQL command is addressed in this patch, because the
code-level approach doesn't satisfy me completely. It might be easier to
just bite the bullet and review all the optional clauses' return values
rather than add a layer as this patch does.
The feature is still available for MySQL with this patch, so let's push
it, get feedback, then see how to make the approach scale and revise all
the other commands.
Given that redirecting a tty such as *terminal-io* isn't easy enough,
let's provide a way to copy the summary output to a file. Another way to
solve it would have been to output the summary to the main logs, but
that could have made log parsing more difficult than necessary.
Let's see how users like it...
With this the user now has a way to say where the files are going to be
read from and matched against the regular expression. That used not to
be necessary in the archive expansion mode, but is required now that the
feature is exposed in more cases.
When using LOAD CSV it's possible to load from filenames matching a
regular expression, but for that to work the *csv-path-root* needs to be
properly set up at run-time.
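A sketch of what that looks like from the command side, assuming the
directory is given with an IN DIRECTORY clause next to the matching rule;
the paths, regexp and table name are illustrative:

    -- load every file matching the regexp found under the given directory
    -- (paths, regexp and table name are made up)
    LOAD CSV
         FROM ALL FILENAMES MATCHING ~/sales.*[.]csv$/
              IN DIRECTORY '/data/incoming'
         INTO postgresql://dbuser@localhost/mydb?sales
         WITH fields terminated by ',';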
That allows using the same SQL files as usual when using pgloader, as it
even supports the \i and \ir psql features (and dollar quoting, etc.).
In passing, refactor the docs to avoid saying the same things all over the
place, which isn't a very good idea in a man page, at least as far as
editing it is concerned.
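For instance, assuming the SQL file is the one given to the BEFORE LOAD
EXECUTE clause (the file names here are made up), that file can now rely on
the usual psql conventions such as \i to include another script:

    -- schema/setup.sql is a made-up name; that file may itself use \i, \ir
    -- and dollar quoting as with psql
    LOAD CSV
         FROM 'data.csv'
         INTO postgresql://dbuser@localhost/mydb?mytable
         WITH fields terminated by ','
         BEFORE LOAD EXECUTE 'schema/setup.sql';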
The parser was happily parsing a connection string such as the
following, but the rest of the code didn't really know what to do about
it:
mysql://unix:/var/run/mysqld/mysqld.sock:/main
In passing, fix bugs where the PostgreSQL unix domain socket connection
was still a brick shy of a load, omitting to consider the case where the
connection host is actually a cons of the form '(:unix . "path/to/socket").
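For reference, the PostgreSQL counterpart of the MySQL URI above would then
look something like the following; this merely mirrors the MySQL syntax, and
the socket directory and database name are assumptions, not taken from the
patch:

    -- socket directory and database name are illustrative
    postgresql://unix:/var/run/postgresql:/mydb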
In particular, allow for a space to be used in the filename. The only
character that is not permitted anymore is the quote itself ('); it
should be easy enough to allow escaping it, as is done in the password
field, if required.
Should probably fix #54, even though the lack of data currently reported
in that issue makes this a blind guess only.
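So a source clause such as this one, with spaces in the path, should now be
accepted; the path and table name are made up:

    -- the path (with spaces) and the table name are illustrative
    LOAD CSV
         FROM '/tmp/monthly sales report.csv'
         INTO postgresql://dbuser@localhost/mydb?sales
         WITH fields terminated by ',';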
The new WITH options allow the user to set values for the dynamic
variables *copy-batch-rows*, *copy-batch-size* and *concurrent-batches*.
That's needed in cases like issue #16, even with the batch size
defaulting to what looks like a proper setup.
In the longer term a serious review of pgloader's memory usage should be
done, the numbers being way higher than the batch sizes we set up here.
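A sketch of what tuning those from a load command could look like; the
dynamic variables are the ones named above, but the exact WITH option
spellings used here are an assumption:

    -- the option names below are assumed spellings that would map to
    -- *copy-batch-rows*, *copy-batch-size* and *concurrent-batches*
    LOAD CSV
         FROM 'data.csv'
         INTO postgresql://dbuser@localhost/mydb?mytable
         WITH fields terminated by ',',
              batch rows = 10000,
              batch size = 20 MB,
              batch concurrency = 3;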