Some path computation didn't work when trying to regression test the
produced bundle.
Also, the bundle building steps would use the pgloader system definition and
dependencies from what's currently available in Quicklisp rather than from
the local pgloader.asd being built.
It used to be that the extra clause had to be parsed before guards, but
there's no reason why users wouldn't think to write their clauses the other
way round, so add support for that as well.
See #779.
Apparently cl+ssl needs to be reloaded in a very specific way at image
startup time, and provides a function to do just that. Let's try and use
this piece of magic rather than calling cffi:load-foreign-library directly.
The MySQL connection string parameter for SSL usage is useSSL, so map that
option name to our expected values for sslmode in database connection
strings.
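
The idea can be sketched as follows, in Python rather than pgloader's Common Lisp. This is only an illustration: the function name `sslmode_from_uri`, the default value, and the target values `require` and `disable` are assumptions made for the sketch, not pgloader's actual mapping.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical mapping from MySQL's useSSL values to sslmode-style values.
# The chosen targets are assumptions made for this sketch.
USE_SSL_TO_SSLMODE = {"true": "require", "false": "disable"}


def sslmode_from_uri(uri, default="prefer"):
    """Return an sslmode-style value for a MySQL connection string."""
    params = parse_qs(urlsplit(uri).query)
    values = params.get("useSSL")
    if not values:
        return default          # option not given: keep the default
    value = values[-1].lower()  # last occurrence wins
    if value not in USE_SSL_TO_SSLMODE:
        raise ValueError(f"unsupported useSSL value: {value!r}")
    return USE_SSL_TO_SSLMODE[value]
```

For example, `sslmode_from_uri("mysql://root@localhost/db?useSSL=false")` picks the "disable" entry of the table.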
See #748.
The default logfile location seems to be `/tmp/pgloader/pgloader.log`,
not `/tmp/pgloader.log` as currently documented. This is observable in
practice and also in [the source
code](5b227200a9/src/main.lisp (L110)).
We would hard-code the schema name into the table's name in the DB3 case on
the grounds that a db3/dbf file doesn't have a notion of a schema. But when
the user wants to add data into an existing target table, then we merge the
catalogs and must keep the given target schema and table name.
Fix #701.
Accept empty password lines in ~/.pgpass files, and when pgloader otherwise
fails to parse or process the file, log a warning and return a nil password.
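
A rough Python sketch of that behavior, assuming the usual `hostname:port:database:username:password` line layout of the password file. The helper names are hypothetical and this is not pgloader's parser: an empty password field is returned as an empty string, while any parse failure is logged and turned into `None`.

```python
import logging

log = logging.getLogger("pgpass-sketch")


def split_pgpass_line(line):
    """Split one line on unescaped colons, honoring backslash escapes."""
    fields, current, chars = [], [], iter(line)
    for ch in chars:
        if ch == "\\":
            current.append(next(chars, "\\"))  # keep the escaped character
        elif ch == ":":
            fields.append("".join(current))
            current = []
        else:
            current.append(ch)
    fields.append("".join(current))
    return fields


def _matches(pattern, value):
    return pattern == "*" or pattern == str(value)


def lookup_password(lines, host, port, dbname, user):
    """Return the first matching password, or None when nothing is usable."""
    try:
        for raw in lines:
            line = raw.rstrip("\n")
            if not line or line.startswith("#"):
                continue
            fields = split_pgpass_line(line)
            if len(fields) != 5:
                raise ValueError(f"expected 5 fields, got {len(fields)}")
            *patterns, password = fields
            wanted = (host, port, dbname, user)
            if all(_matches(p, v) for p, v in zip(patterns, wanted)):
                return password  # may be the empty string
    except ValueError as err:
        log.warning("could not process pgpass entries: %s", err)
    return None
```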
See #748.
In a previous commit we re-used the package name pgloader.copy for the now
separated implementation of the COPY protocol, but this package was already
in use for the implementation of the COPY file format as a pgloader source.
Oops.
And CCL was happily doing its magic anyway, so that I've been blind to the
problem.
To fix, rename the new package pgloader.pgcopy, and to avoid having to deal
with other problems of the same kind in the future, rename every source
package pgloader.source.<format>, so that we now have pgloader.source.copy
and pgloader.pgcopy, two visibly different packages to deal with.
This light refactoring came with a challenge though. The split in between
the pgloader.sources API and the rest of the code involved some circular
dependencies in the namespaces. CL is pretty flexible here because it can
reload code definitions at runtime, but it was still a mess. To untangle it,
implement a new namespace, the pgloader.load package, where we can use the
pgloader.sources API and the pgloader.connection and pgloader.pgsql APIs
too.
A little problem gave birth to quite a massive patch. As it happens when
refactoring and cleaning-up the dirt in any large enough project, right?
See #748.
In data-only mode, the foreign keys parameter (which defaults to True) means
something special: we remove the fkey definitions prior to the data only
load then re-install the fkeys.
This got broken in a previous commit, the WITH clause option being processed
like the other DDL ones that only make sense when creating the schema. While
fixing the setting in copy-database, we also have to fix a nesting bug in
complete-pgsql-database that would prevent the fkeys from being installed
again at the end of the load.
This patch not only fixes that, but also reviews the implementation of the
drop-pgsql-fkeys support function to use a more modern internal API,
preparing a list of SQL statements to be sent to the psql-execute level.
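
The data-only flow described above (drop the fkeys, load, re-install them) can be sketched in Python as building plain lists of SQL statements and handing them to an executor. All names here (`FKey`, `data_only_load`, the `execute` and `load_data` callables) are hypothetical and only illustrate the shape of the approach, not pgloader's internal API.

```python
from collections import namedtuple

# Hypothetical description of one foreign key from the target catalog.
FKey = namedtuple("FKey", "schema table name definition")


def quote_ident(name):
    """Quote an SQL identifier, doubling embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def drop_fkey_statements(fkeys):
    return [
        f"ALTER TABLE {quote_ident(fk.schema)}.{quote_ident(fk.table)} "
        f"DROP CONSTRAINT IF EXISTS {quote_ident(fk.name)}"
        for fk in fkeys
    ]


def create_fkey_statements(fkeys):
    return [
        f"ALTER TABLE {quote_ident(fk.schema)}.{quote_ident(fk.table)} "
        f"ADD CONSTRAINT {quote_ident(fk.name)} {fk.definition}"
        for fk in fkeys
    ]


def data_only_load(fkeys, execute, load_data):
    """Drop the fkeys, load the data, then install the fkeys again."""
    for sql in drop_fkey_statements(fkeys):
        execute(sql)
    load_data()
    for sql in create_fkey_statements(fkeys):
        execute(sql)
```

Keeping the statements as a list separates deciding what to run from running it, which makes the re-install step at the end hard to skip by accident.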
Fixes #745.
Not all error paths are counted correctly at this point; this commit
improves the situation in passing. A thorough review should probably be
planned sometime.
Several places in the code are involved in dealing with the default values
from MS SQL. The catalog query deals with strange quoting rules on the
source side and used to directly fill in the value PostgreSQL expects. But
then the quoting of a function call wasn't properly handled.
Rather than coping with the quoting rules here, have the catalog query
return a pgloader specific placeholder "GENERATE_UUID". Then the MS SQL
specific code can normalize that to the symbol :generate_uuid. Then the
generic PostgreSQL DDL code can implement the proper replacement for that
symbol, not having to know where it comes from.
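
The three stages can be sketched in Python as follows. The tuple representation stands in for the Lisp keyword `:generate_uuid`, and the PostgreSQL expression used as the final replacement (`gen_random_uuid()`) is an assumption chosen for the sketch; the function names are hypothetical as well.

```python
# Stage 1: what the catalog query is assumed to return for a UUID default.
PLACEHOLDER = "GENERATE_UUID"


def normalize_mssql_default(raw_default):
    """Source-specific step: turn the placeholder into a symbolic marker."""
    if raw_default == PLACEHOLDER:
        return ("symbol", "generate_uuid")
    return ("literal", raw_default)


def format_pgsql_default(default):
    """Generic DDL step: render a default without knowing its origin."""
    kind, value = default
    if kind == "symbol" and value == "generate_uuid":
        return "gen_random_uuid()"  # assumed replacement expression
    return value
```

Only the first two stages know about MS SQL; the DDL step sees a symbol and never has to handle the source's quoting rules.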
Fix #742.
PostgreSQL understands both spellings of the data type name and implements
float as being a double precision value, so we should refrain from any
warning about that non-discrepancy when doing a data-only load.
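
A minimal Python sketch of such a check: type names are reduced to a canonical spelling before being compared, so that equivalent spellings never trigger a warning. The alias table is a small assumed subset and the function names are hypothetical.

```python
# Spellings that PostgreSQL treats as the same type (small, assumed subset).
TYPE_ALIASES = {
    "float": "double precision",
    "float8": "double precision",
}


def canonical_type(name):
    """Lowercase, squeeze whitespace, then resolve known aliases."""
    name = " ".join(name.lower().split())
    return TYPE_ALIASES.get(name, name)


def type_mismatch(source_type, target_type):
    """True only when the two names denote different types."""
    return canonical_type(source_type) != canonical_type(target_type)
```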
Should fix #746.
Some copy-paste errors made their way to those queries and prevented usage
of pgloader, but I missed that because I was using a previous version of the
query text files in my interactive environment.
Also, SQLite doesn't like the queries finishing with a semi-colon, so remove
them.
Fixes #747.
The support for drop default in (user defined) casting rules was completely
broken in SQLite, because the code didn't even bother looking at what's
returned after applying the casting rules.
This patch fixes the code so that it uses the pgcol instance's default
value, as it stands after applying casting rules. The bug also existed in a
subtle form for MySQL and MS SQL, but would only show up there when the
default value is spelled using a known variation of “current timestamp”.
First review the `sqlite_sequence` support so that we can still work with
databases that don't have this catalog, which doesn't always exist -- it
might depend on the SQLite version though.
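
As an illustration using Python's `sqlite3` module (not pgloader's code), the catalog can be probed in `sqlite_master` before it is queried. In the SQLite builds I am aware of, `sqlite_sequence` only appears once a table with an AUTOINCREMENT column has been created; the function name `list_sequences` is hypothetical.

```python
import sqlite3


def list_sequences(conn):
    """Return {table: last_value}, or {} when sqlite_sequence is absent."""
    exists = conn.execute(
        "SELECT 1 FROM sqlite_master "
        "WHERE type = 'table' AND name = 'sqlite_sequence'"
    ).fetchone()
    if exists is None:
        return {}  # no AUTOINCREMENT table was ever created
    return dict(conn.execute("SELECT name, seq FROM sqlite_sequence"))
```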
Then while at it use the sql macro to host the SQLite “queries” in their own
files, enhancing the hackability of the system to some degree. Not that
much, because we have to use a lot of PRAGMA commands and then the column
output isn't documented with the query text itself.
The handling of the SQLite catalogs was fixed in a previous patch, but
either it's been broken in between or it never actually worked (oops).
Moreover, the recent patch about :on-update-current-timestamp changed the
casting rules matching code and we should position :auto-increment from the
SQLite module rather than "auto_increment" as before. That's better, but
wasn't done.
Fix #563 again, tested with a provided test-case (thanks!).
Namely the actions are “keep extra” and “drop extra” and the casting rule
guard is “with extra on update current timestamp”. Having support for those
elements in the casting rules allows such a definition as the following:
    type timestamp with extra on update current timestamp
      to "timestamp with time zone" drop extra
The effect of such a cast rule would be to ignore the MySQL extra
definition and then prevent pgloader from creating the PostgreSQL triggers
that implement the same behavior.
Fix #735.
In case of a failure to pre-process or transform values in the row that has
been read, we need to refrain from pushing the row into our next batch.
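
In Python terms the intended control flow looks roughly like the sketch below. The names `fill_batch`, `transforms` and `on_error` are hypothetical; the point is only that a row whose transformation fails is reported and skipped, and never reaches the batch.

```python
def fill_batch(rows, transforms, on_error):
    """Transform each row; rows that fail are reported, not batched."""
    batch = []
    for row in rows:
        try:
            processed = [fn(value) for fn, value in zip(transforms, row)]
        except Exception as err:
            on_error(row, err)  # count or log the rejected row
            continue            # and keep it out of the batch
        batch.append(processed)
    return batch
```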
See #726, that got hit by the recent bug in the middle of something else
entirely.
When dealing with MATERIALIZING VIEWS test cases and failing in the middle
of them, as it happens when fixing bugs, then it was tedious (to say the
least) to clean-up manually the view each time.
That said, for end-users, doing it automatically would risk cleaning-up the
wrong view definition if they had a typo in their pgloader command, say.
Common Lisp helps a lot here: we simply create a restart that is only
available interactively for the developers of pgloader!
We forgot that rule in the case of creating the target tables for the
materializing views commands, which led to surprising and wrong behavior.
Fix #721, and add a new test case while at it.
It might be that the schema exists but we didn't find what we expected to
in there, so that it didn't make it to pgloader's internal catalogs. Be
friendly to the user with a better error message.
Fix #713.
Refactor file organisation further to allow for adding a “direct stream”
option when the on-error-stop behavior has been selected. This currently
happens by default for database sources.
Introduce the new WITH option “on error resume next” which forces the
classic behavior of pgloader. The option “on error stop” already existed,
but its implementation is new.
When this new behavior is activated, the data is sent to PostgreSQL
directly, without intermediate batches being built. It means that the whole
operation fails at the first error, and we don't have any information in
memory to try replaying any COPY of the data. It's gone.
This behavior should be fine for database migrations as you don't usually
want to fix the data manually in intermediate files, you want to fix the
problem at the source database and do the whole dance all-over again, up
until your casting rules are perfect.
This patch might also bring some performance benefits in terms of both
timing and memory usage, though the local testing didn't show much of
anything for the moment.
Copy some code over from cl-postgres-trivial-utf-8 and add the support for
PostgreSQL COPY escaping right at the same place, allowing to allocate our
formatted utf-8 buffer only once, with the escaping already installed.
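
The technique can be illustrated with a short Python sketch that encodes and escapes in the same loop, writing into a single output buffer. It covers only the backslash, tab, newline and carriage return escapes of the COPY text format plus the `\N` null marker; the function name is hypothetical and this is not the code taken from cl-postgres-trivial-utf-8.

```python
# Characters that need a backslash escape in the COPY text format
# (a subset: backslash, tab, newline, carriage return).
COPY_ESCAPES = {
    "\\": b"\\\\",
    "\t": b"\\t",
    "\n": b"\\n",
    "\r": b"\\r",
}


def copy_field_bytes(value):
    """Encode one field as UTF-8 with COPY escaping applied in one pass."""
    if value is None:
        return b"\\N"  # COPY text marker for NULL
    out = bytearray()  # the only buffer we build
    for ch in value:
        escaped = COPY_ESCAPES.get(ch)
        out += escaped if escaped is not None else ch.encode("utf-8")
    return bytes(out)
```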
This patch was expected to be more about perfs, but it's actually only about
code cleaning it seems, as it doesn't make a big difference in the testing I
could do here.
That said, getting rid of one intermediate buffer should be nice in terms of
memory management.
The copy format and batch facilities are no longer the meat of our
PostgreSQL support in the src/pgsql directory, so have them live in their
own space.
This function prepares the data to be sent down to PostgreSQL as a clean
COPY text with unicode handled correctly. This commit is mainly a clean-up
of the function, and also adds some smarts to try and make it faster.
In testing, the function is now marginally faster than before, but not by
much. The hope here is that it's now easier to optimize it.
We now have a qmynd-impl::decoding-error condition to deal with, which has
very good error reporting, so that we don't need to poke into babel details
anymore. The error message adds the column name, type and collation to the
output, too.
We keep the babel handlers for a while until people have all migrated to
using the patch in qmynd.
With the fix to Qmynd, fix #716.
The previous patch introduced parser conflicts and we couldn't parse some
expressions any more, such as the following:
    fields escaped by '\',
It's now possible to represent single quote as either '''', '\'', or '0x27'
and we still can parse '\' as being a single backslash character.
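
The accepted spellings can be summarized with a small Python decoder. It is only an illustration of the rules listed above, not the actual grammar; the function name `parse_quoted_char` is hypothetical.

```python
def parse_quoted_char(literal):
    """Decode a quoted one-character literal from a command file."""
    if len(literal) < 3 or literal[0] != "'" or literal[-1] != "'":
        raise ValueError(f"not a quoted literal: {literal}")
    body = literal[1:-1]
    if body in ("''", "\\'"):          # '''' and '\'' both mean a quote
        return "'"
    if body.lower().startswith("0x"):  # hexadecimal form such as '0x27'
        return chr(int(body[2:], 16))
    if len(body) == 1:                 # includes '\' as a lone backslash
        return body
    raise ValueError(f"unsupported literal: {literal}")
```

The order of the checks matters: the two-character forms are tested before the single-character case, so `'\'` still decodes to a backslash.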
See #705.
The option "fields optionally enclosed by" was missing a way to easily
specify a single quote as the quoting character. Add '\'' to the existing
solution '0x27' which isn't as friendly.
See #705.
The query for concurrency-support didn't get the memo that we should ignore
PostgreSQL identifier-case when querying the source MySQL database. Fix the
query string to include column names as given by the MySQL catalogs.
In bug report #703, the problem is found in PostgreSQL queries. This has
been fixed before already. Trying to reproduce the bug produced an error in
the concurrency-support query instead, so let's fix this one.
Fix #703.
The website is moving to pgloader.org and readthedocs.io is going to be
integrated. Let's see what happens. The docs build fine locally with the
sphinx tools and the docs/Makefile.
Having separate files for the documentation should ease maintenance and
make it easier to add new topics, such as Common Lisp Hackers level docs,
which are currently missing.