Commit Graph

14 Commits

Author SHA1 Message Date
Dimitri Fontaine
e129e77eb6 Fix SQL execute counters maintenance. 2018-02-19 22:06:51 +01:00
Dimitri Fontaine
03a8d57a50 Review --verbose log message.
The verbosity is not that easy to adjust. Remove useless messages and add a
new one telling when the COPY of a table is done. As we might have to wait
for some time while indexes are being built, keep the CREATE INDEX lines.
Also keep the ALTER TABLE lines, both for primary keys and foreign keys,
again because the user might have to wait for quite some time.
2017-08-21 15:27:13 +02:00
Dimitri Fontaine
e37cb3a9e7 Split SQL queries into their own files.
This change was long overdue. Ideally we would use something like the YeSQL
library for Clojure, but it seems like the cl-yesql equivalent is not ready
yet, and it depends on an experimental build system...

So this patch introduces a URL abstraction built on top of a hash table.
You can then reference src/pgsql/sql/list-all-columns.sql as

  (sql "pgsql/list-all-columns.sql")

in the source code directly.

So for now the templating system is CL's format language. It is still an
improvement over embedded strings. Again, one step at a time.
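
A minimal sketch of such a lookup could look like the following; the names
*sql-queries* and register-sql-file and the use of uiop:read-file-string are
assumptions for illustration, not necessarily the actual pgloader internals:

  ;; sketch only: map "pgsql/list-all-columns.sql" style keys to SQL text
  (defvar *sql-queries* (make-hash-table :test 'equal))

  (defun register-sql-file (key pathname)
    "Read PATHNAME once and store its contents under KEY."
    (setf (gethash key *sql-queries*)
          (uiop:read-file-string pathname)))

  (defun sql (key &rest format-arguments)
    "Return the query stored under KEY, templated with CL's FORMAT."
    (apply #'format nil (gethash key *sql-queries*) format-arguments))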
2017-07-06 03:16:05 +02:00
Dimitri Fontaine
26d372bca3 Implement support for non-btree indexes (e.g. MySQL spatial keys).
When pgloader fetches the index list from a source database, it doesn't
fetch information about access methods for the indexes: I don't even know if
the overlap between index access methods from one RDBMS to another covers
more than just btree...

It could happen that MySQL indexes a "geometry" column, though. This datatype is
converted automatically to "point" by pgloader, which is good. But the index
creation would fail with the following error message:

  Database error 42704: data type point has no default operator class for access method "btree"

In this patch when setting up the target schema we issue a PostgreSQL
catalog query to dynamically list those datatypes without btree support and
fetch their opclasses, with a hard-coded preference for GiST, then GIN, so
as to be able to automatically use the proper access method when btree isn't
available. And now pgloader transparently issues the proper statement:

  CREATE INDEX idx_168468_idx_location ON pagila.address USING gist(location);

Currently this exploration is limited to indexes with a single column. To
implement the general case we would need a more complex lookup: we would
have to find the intersection of all the supported access methods for all
involved columns.

Of course we might need to do that someday. One step at a time is plenty
good enough though.
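
For illustration, the kind of catalog lookup described above could be written
along these lines; the exact query and the *non-btree-opclass-sql* name are
assumptions for this sketch, not pgloader's actual code:

  ;; sketch only: pick a default non-btree access method for a data type,
  ;; preferring GiST, then GIN; '~a' is filled in with CL's FORMAT.
  (defparameter *non-btree-opclass-sql* "
    select am.amname
      from pg_opclass opc
           join pg_am am on am.oid = opc.opcmethod
           join pg_type t on t.oid = opc.opcintype
     where t.typname = '~a'
       and opc.opcdefault
       and am.amname <> 'btree'
     order by case am.amname when 'gist' then 0 when 'gin' then 1 else 2 end
     limit 1")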
2017-07-06 00:42:43 +02:00
Dimitri Fontaine
8405c331a9 Error handling improvements for PostgreSQL schema.
In the complete PostgreSQL schema step, an error would be logged as you
expect but poorly handled: it would have the whole transaction rolled back,
meaning that a single Primary Key definition failure would cancel all the
others, plus the foreign keys, and also the triggers and comments.

It happens that other systems allow a primary key column to contain NULL
values, which is forbidden in the standard and enforced by PostgreSQL, so
that's not a theoretical concern here.
2017-07-05 17:53:33 +02:00
Dimitri Fontaine
3f7853491f Refactor PostgreSQL error handling.
The code was too complex and the transaction / connection handling wasn't
good enough: we would reconnect too many times when a ROLLBACK; is all we
need to be able to continue our processing.

Also fix some stats counters about errors handled, and improve the error
message by explicitly mentioning PostgreSQL and the name of the table the
error comes from.
2017-07-04 01:41:08 +02:00
Dimitri Fontaine
1e436555a8 Refactor PostgreSQL conditions.
Use a single deftype postgresql-unavailable rather than copy/pasting the
same list of conditions in several places.
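
For illustration, the shape of such a deftype; the member conditions listed
here are assumptions (cl-postgres condition names used as examples), the
actual list lives in the pgloader sources:

  ;; sketch only: one type naming all "server is gone" conditions
  (deftype postgresql-unavailable ()
    '(or cl-postgres:database-connection-error
         cl-postgres:database-connection-lost))

A handler-case clause can then match that single type, since condition
handlers are selected with typep.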
2017-06-29 14:08:52 +02:00
Dimitri Fontaine
cea82a6aa8 Reconnect to PostgreSQL in case of connection lost.
It may happen that PostgreSQL is restarted while pgloader is running, or
that for some other reason we lose the connection to the server. In most
cases we know how to gracefully reconnect and retry, so just do so.
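
A minimal sketch of the retry idea, assuming the postgresql-unavailable type
above and a caller-provided reconnect function; pgloader's actual
implementation differs:

  ;; sketch only, not pgloader's actual code
  (defun call-with-reconnect (thunk reconnect &key (attempts 3))
    "Call THUNK; when the connection is lost, RECONNECT and try again."
    (loop repeat attempts
          do (handler-case (return (funcall thunk))
               (postgresql-unavailable (condition)
                 (warn "lost connection to PostgreSQL: ~a" condition)
                 (funcall reconnect)))))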

Fixes #546 initial report.
2017-06-29 12:34:34 +02:00
Dimitri Fontaine
25e5ea9ac3 Refactor error handling in complete-pgsql-database.
Given the new SQLite test case from issue #563 we see that pgloader doesn't
handle errors gracefully in the post-copy stage. That's because the API was
not properly defined: we should use pgsql-execute-with-timing rather than
another construct here, because it allows the "on error resume next"
behavior we want for after-load DDL statements.
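
Roughly, the behavior we want is per-statement error handling for the
after-load DDL, as in this sketch (after-load-ddl and execute-ddl are
hypothetical stand-ins, not the actual pgloader API):

  ;; sketch only: log the failing statement and continue with the next one
  (dolist (statement after-load-ddl)
    (handler-case (execute-ddl statement)
      (cl-postgres:database-error (e)
        (warn "skipping failed DDL ~s: ~a" statement e))))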

See #563.
2017-06-08 12:09:11 +02:00
Dimitri Fontaine
1023577f50 Review internal database migration logic.
Many options are now available to pgloader users, including shortcuts that
were not defined clearly enough. That could result in stupid things being
done at times.

In particular, when picking the "data only" option, indexes are not to be
dropped before loading the data, but pgloader would still try to create them
again at the end of the load, because the option that controls that behavior
defaults to true and is not impacted by the "data only" choice.

In this patch we review the logic and ensure it's applied in the same
fashion in the different phases of the database migration: preparation,
copying, rebuilding of indexes and completion of the database model.
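
Roughly, the intended rule, sketched with hypothetical names (the actual
option plumbing in pgloader differs):

  ;; sketch only: rebuild indexes at the end only when we dropped them first
  (defun create-indexes-again-p (create-indexes data-only)
    (and create-indexes (not data-only)))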

See also 96b2af6b2a where we began fixing
oddities but didn't go far enough.
2017-02-26 14:48:36 +01:00
Dimitri Fontaine
8da09d7bed Log PostgreSQL Catalog queries at SQL log level.
See #476 where it would have been helpful to see the PostgreSQL catalog
queries with `--log-min-messages sql` in the bug report. Also more
generally useful.
2017-01-10 21:12:34 +01:00
Dimitri Fontaine
381ba18b50 Add a new log level: SQL.
This sits between NOTICE and INFO, allowing us to have a complete log of
the SQL queries sent to the server while avoiding the very verbose traffic
of the DEBUG log level.

See #498.
2017-01-03 22:27:17 +01:00
Dimitri Fontaine
2d47c4f0f5 Use internal catalog when loading from files.
Replace the ad-hoc code that was used before in the load from file code
path to use our full internal catalog representation, and adjust APIs to
that end.

The goal is to use catalogs everywhere in the PostgreSQL target API,
allowing us to reason explicitly about source and target catalogs; see #400
for the main use case.
2016-08-05 11:42:06 +02:00
Dimitri Fontaine
2aedac7037 Improve our internal catalog representation.
First, add indexes and foreign keys to the list of objects supported by the
shared catalog facility, where they were previously only found in the pgsql
schema specific package for historical reasons.

Then also add to our catalog internal structures the notion of a trigger
and a stored procedure, allowing for cleaner advanced default values
support in the MySQL cast functions.
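
For illustration, the new catalog entries could have shapes along these
lines; the slot names are assumptions, not pgloader's actual definitions:

  ;; sketch only: illustrative shapes for the new catalog objects
  (defstruct trigger name table-name action procedure)
  (defstruct procedure name language returns body)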

Now that we have a proper and complete catalog, review the pgsql module DDL
output function in terms of the catalog and rewrite the schema creation
support so that it takes direct benefit from our internal catalog
representation.

In passing, clean up the code organisation of the pgsql target support
module so that it is easier to work with.

The next step consists of getting rid of src/pgsql/queries.lisp: this
facility should be replaced by the use of a target catalog that we fetch the
usual way, thanks to the new src/pgsql/pgsql-schema.lisp file and its
list-all-* functions.

That will in turn allow for an explicit step of merging the pre-existing
PostgreSQL catalog when it has been created by tools other than pgloader,
that is, when migrating with the help of an ORM. See #400 for details.
2016-08-01 23:14:58 +02:00