This is a blind patch, given that I couldn't CREATE TABLE as per the bug
report to see for myself what's happening. Better to have some tests
in place though.
PostgreSQL requires that an identifier begin with a letter or an
underscore, so an identifier that begins with a digit must be quoted. In
the current coding pgloader will unnecessarily quote some identifiers
that begin with an accented Unicode letter, but that's only cosmetic and
isn't worth worrying about (famous last words).
That should help fix #158, where MS SQL uses the following name for
one of its foreign keys: fk_dbo.track_dbo.artist_artistid. PostgreSQL
refuses unquoted fkey names with dots in them.
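
As a rough illustration of the quoting rule described above, here is a
minimal Python sketch (pgloader itself is Common Lisp; the function name
maybe_quote is hypothetical). It assumes an ASCII-only check, which is
why accented letters end up quoted, and it ignores reserved words, case
folding and the `$` character that PostgreSQL also accepts after the
first position. Whether pgloader quotes or rewrites the name from #158
is not stated here; the sketch simply quotes it.

```python
import re

# A "plain" identifier: a letter or underscore first, then letters,
# digits or underscores. ASCII-only on purpose (an assumption), so
# names starting with an accented letter are quoted needlessly.
_PLAIN_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*\Z")


def maybe_quote(identifier: str) -> str:
    """Return the identifier as-is when it is plain, else double-quote it."""
    if _PLAIN_IDENTIFIER.match(identifier):
        return identifier
    # Embedded double quotes are escaped by doubling them.
    return '"' + identifier.replace('"', '""') + '"'


if __name__ == "__main__":
    for name in ("artist", "2013_sales", "fk_dbo.track_dbo.artist_artistid"):
        print(maybe_quote(name))
```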
Given the slashdot effect and some bad luck, the binary artefacts of the
3.2.0 release are not currently available, and anyway contain known bugs
that have since been fixed thanks to early adopters who opened issues on
GitHub.
So we hastily publish the current version of the master branch as a
GitHub release with binary files.
The bug is related to the processing of empty lines in the middle of
quoted text by cl-csv, whose state machine has gotten quite complex in
order to handle all the crazy different CSV variants out there.
Testing shows the bug is fixed in pgloader by just updating cl-csv.
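
For reference, the kind of input at stake looks like the following. This
is a small sketch using Python's standard csv module rather than cl-csv,
and the sample data is made up for the illustration: a quoted field that
spans several lines, one of which is empty, must come back as a single
field with the blank line preserved.

```python
import csv
import io

# Made-up sample: the second record has an empty line inside quotes.
DATA = 'id,comment\n1,"first paragraph\n\nsecond paragraph"\n2,"plain"\n'


def read_rows(text: str) -> list:
    """Parse CSV text, keeping newlines embedded in quoted fields."""
    # newline="" leaves line endings untouched for the csv module.
    return list(csv.reader(io.StringIO(text, newline="")))


if __name__ == "__main__":
    for row in read_rows(DATA):
        print(row)
```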
The option currently only works within the same build environment where
the image was first built, as noted in #133. This is an attempt at
convincing ASDF not to load the systems that pgloader depends on, so
that only the new pgloader definition gets loaded.
While it looks sound in principle, I failed to get it to work in the
lab. Given that nothing worked at all before this patch, it's not a
regression, so let's push it, as it makes the code saner.
Also, it looks like asdf::*immutable-systems* is what we want here, but
that's asdf 3.1.x and we're not there yet.
Loading external libs at application startup time is not convenient as
it forces users to install freetds everywhere even when they don't need
it. This patch makes it so that freetds is only loaded when pgloader is
asked to load from a MS SQL database source.
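
The load-on-demand idea can be sketched as follows, in Python rather
than pgloader's Common Lisp. The class LazyLibrary and the loader
callable are hypothetical names for the illustration; in a real setup
the loader could be something like ctypes.CDLL pointed at the FreeTDS
shared library, whose exact file name is platform dependent and is not
assumed here.

```python
from typing import Any, Callable, Optional


class LazyLibrary:
    """Defer loading a library until the first time it is needed."""

    def __init__(self, name: str, loader: Callable[[str], Any]) -> None:
        self.name = name
        self._loader = loader          # e.g. ctypes.CDLL (assumption)
        self._handle: Optional[Any] = None

    @property
    def loaded(self) -> bool:
        return self._handle is not None

    def get(self) -> Any:
        # Load on first use only; later calls reuse the same handle.
        if self._handle is None:
            self._handle = self._loader(self.name)
        return self._handle


def open_source(source_type: str, freetds: LazyLibrary) -> str:
    """Touch the FreeTDS library only for MS SQL sources."""
    if source_type == "mssql":
        freetds.get()
    return source_type
```

With this shape, a CSV-only run never calls the loader, so a missing
freetds install does not matter until an MS SQL source is requested.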
Note that we could have done the same for SSL if it weren't potentially
needed to connect to PostgreSQL, which isn't optional in the current
pgloader implementation.
It's now possible to have pgloader print out its summary in one of
several formats: human-readable (default), csv, copy or json. The
choice of format is made depending on the extension of the summary
filename picked on the command line with the option --summary.
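
The extension-based choice could look like the sketch below. It is
Python for illustration only, the function name summary_format is
hypothetical, and the exact extensions (.csv, .copy, .json) are an
assumption derived from the format names listed above.

```python
import os

# Assumed mapping from file extension to summary format.
_FORMATS = {".csv": "csv", ".copy": "copy", ".json": "json"}


def summary_format(filename: str) -> str:
    """Pick the summary format from the filename's extension."""
    extension = os.path.splitext(filename)[1].lower()
    # Anything unrecognized falls back to the default format.
    return _FORMATS.get(extension, "human-readable")


if __name__ == "__main__":
    for name in ("summary.json", "summary.csv", "summary.txt"):
        print(name, "->", summary_format(name))
```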
Fix bugs related to parsing the new COPY type, and make it so that we
know how to parse options (and fields, and other type-dependent things)
even when --type is missing, in case the source URL has the information.
PostgreSQL COPY format is not really CSV but something way easier to
parse. Funnily enough, parsing it as CSV is not that easy, so we add
here a special simple parser for the COPY format.
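
To show why a dedicated parser is simple, here is a minimal Python
sketch of reading one line of COPY text format (not pgloader's actual
Lisp parser; the function names are hypothetical). It assumes the
default tab delimiter and \N null marker, and it leaves out the octal
and hexadecimal escapes that PostgreSQL also accepts.

```python
from typing import List, Optional

# Single-letter backslash escapes of the COPY text format.
_ESCAPES = {"b": "\b", "f": "\f", "n": "\n", "r": "\r", "t": "\t", "v": "\v"}


def _unescape(field: str) -> Optional[str]:
    """Decode one raw field; a bare \\N stands for NULL."""
    if field == r"\N":
        return None
    out = []
    i = 0
    while i < len(field):
        if field[i] == "\\" and i + 1 < len(field):
            nxt = field[i + 1]
            # Unknown escapes stand for the character itself.
            out.append(_ESCAPES.get(nxt, nxt))
            i += 2
        else:
            out.append(field[i])
            i += 1
    return "".join(out)


def parse_copy_line(line: str) -> List[Optional[str]]:
    """Split a COPY text line on tabs, then decode each field."""
    if line.endswith("\n"):
        line = line[:-1]
    # Tabs inside data are escaped as \t, so a plain split is safe.
    return [_unescape(field) for field in line.split("\t")]
```

No quoting state is needed at all, which is what makes this format
easier to handle than CSV.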
It should also be quite useful for loading pgloader's reject data files
again after fixing them manually. It's still missing some documentation,
without any good excuse for that; will add soon.
Also augment the documentation with examples of bare stdin reading and
of the advantages of unix pipes to stream even remote archived content
down to PostgreSQL.
In passing also allow --field to specify the whole field list, there's
no point in forcing the user to have as many --field switches on the
command line as they have columns in their data source file.
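
The two ways of passing fields can be sketched with Python's argparse
(an illustration only, not pgloader's own option parser; parse_fields is
a hypothetical name). The sketch assumes a comma-separated list and
ignores any per-field options, so it is a simplification.

```python
import argparse
from typing import List


def parse_fields(argv: List[str]) -> List[str]:
    """Collect field names from repeated and/or comma-separated --field."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--field", action="append", default=[])
    args = parser.parse_args(argv)

    fields = []
    for value in args.field:
        # One switch may carry the whole list, separated by commas.
        fields.extend(n.strip() for n in value.split(",") if n.strip())
    return fields


if __name__ == "__main__":
    print(parse_fields(["--field", "id,name, email"]))
    print(parse_fields(["--field", "id", "--field", "name"]))
```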