The verbosity is not that easy to adjust. Remove useless messages and add a
new one telling when the COPY of a table is done. As we might have to wait
for some time while indexes are being built, keep the CREATE INDEX lines.
Also keep the ALTER TABLE lines, both for primary keys and foreign keys,
again because the user might have to wait for quite some time.
This change was long overdue. Ideally we would use something like the YeSQL
library for Clojure, but it seems like the cl-yesql equivalent is not ready
yet, and it depends on an experimental build system...
So this patch introduces a URL abstraction built on top of a hash table.
You can then reference src/pgsql/sql/list-all-columns.sql as
(sql "pgsql/list-all-columns.sql")
in the source code directly.
So for now the templating system is CL's format language. It is still an
improvement over embedded strings. Again, one step at a time.
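A minimal sketch of the idea (names and details are illustrative, not
pgloader's actual implementation): SQL files get registered into a hash
table keyed by a short name, and the sql function looks the query text up
and runs it through CL's format:

    (defvar *sql-queries* (make-hash-table :test 'equal)
      "Maps names such as \"pgsql/list-all-columns.sql\" to their query text.")

    (defun register-sql-file (name pathname)
      "Read PATHNAME and register its content under NAME."
      (with-open-file (s pathname)
        (let ((text (make-string (file-length s))))
          (setf (gethash name *sql-queries*)
                (subseq text 0 (read-sequence text s))))))

    (defun sql (name &rest format-arguments)
      "Return the query registered under NAME, with FORMAT-ARGUMENTS filled
       in through CL:FORMAT directives embedded in the query text."
      (apply #'format nil (gethash name *sql-queries*) format-arguments))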
When pgloader fetches the index list from a source database, it doesn't
fetch information about access methods for the indexes: I don't even know if
the overlap between index access methods from one RDBMS to another covers
more than just btree...
It could happen that MySQL indexes a "geometry" column, though. This
datatype is converted automatically to "point" by pgloader, which is good.
But the index creation would fail with the following error message:
Database error 42704: data type point has no default operator class for access method "btree"
In this patch, when setting up the target schema, we issue a PostgreSQL
catalog query to dynamically list those datatypes without btree support and
fetch their opclasses, with a hard-coded preference for GiST, then GIN, so
as to be able to automatically use the proper access method when btree isn't
available. And now pgloader transparently issues the proper statement:
CREATE INDEX idx_168468_idx_location ON pagila.address USING gist(location);
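The catalog query could take a shape along these lines (an illustrative
sketch, not necessarily the exact query pgloader ships): list the default
opclasses of non-btree access methods for every data type that has no btree
opclass at all, ordering GiST before GIN:

    (defparameter *non-btree-opclasses-sql* "
      select t.typname, am.amname, oc.opcname
        from pg_opclass oc
             join pg_am am on am.oid = oc.opcmethod
             join pg_type t on t.oid = oc.opcintype
       where oc.opcdefault
         and am.amname <> 'btree'
         and not exists (select 1
                           from pg_opclass b
                                join pg_am bam on bam.oid = b.opcmethod
                          where b.opcintype = t.oid
                            and bam.amname = 'btree')
       order by t.typname,
                case am.amname when 'gist' then 0 when 'gin' then 1 else 2 end")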
Currently this exploration is limited to indexes with a single column. To
implement the general case we would need a more complex lookup: we would
have to find the intersection of all the supported access methods for all
the involved columns.
Of course we might need to do that someday. One step at a time is plenty
good enough for now, though.
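For reference, the general-case lookup could be as simple as intersecting
per-column access-method lists; this is a hypothetical sketch, nothing of
the sort is implemented yet:

    (defun common-access-methods (column-type-names type->access-methods)
      "Return the access methods supported by every data type listed in
       COLUMN-TYPE-NAMES, looking each type up in the TYPE->ACCESS-METHODS
       hash table."
      (reduce (lambda (methods type-name)
                (intersection methods
                              (gethash type-name type->access-methods)
                              :test #'string=))
              (rest column-type-names)
              :initial-value (gethash (first column-type-names)
                                      type->access-methods)))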
In the "complete PostgreSQL schema" step, an error would be logged as you
expect but poorly handled: it would have the whole transaction rolled back,
meaning that a single Primary Key definition failure would cancel all the
others, plus the foreign keys, and also the triggers and comments.
It happens that other systems allow a primary key column to contain NULL
values, which is forbidden by the standard and enforced by PostgreSQL, so
that's not a theoretical concern here.
The code was also too complex, and the transaction / connection handling
wasn't good enough: we did too many reconnections when a ROLLBACK; is all we
need to be able to continue our processing.
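The intended pattern is roughly the following sketch, with hypothetical
execute and log-error helpers standing in for the real code: wrap each DDL
statement in its own transaction, and on error just ROLLBACK on the same
connection and move on to the next statement:

    (defun execute-each-or-rollback (connection statements)
      (dolist (statement statements)
        (handler-case
            (progn (execute connection "BEGIN")
                   (execute connection statement)
                   (execute connection "COMMIT"))
          (error (e)
            (log-error "PostgreSQL error in ~s: ~a" statement e)
            (execute connection "ROLLBACK")))))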
Also fix some stats counters about errors handled, and improve the error
messages by explicitly mentioning PostgreSQL and the name of the table where
the error comes from.
It may happen that PostgreSQL is restarted while pgloader is running, or
that for some other reason we lose the connection to the server, and in most
cases we know how to gracefully reconnect and retry, so just do so.
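The retry logic boils down to something like this sketch (condition and
function names are illustrative; the real code catches the driver's
connection error condition):

    (define-condition connection-lost (error) ()) ; stand-in for the driver's condition

    (defun call-with-reconnect (thunk reconnect &key (attempts 3))
      "Call THUNK; when the connection is lost, call RECONNECT and try
       again, up to ATTEMPTS times."
      (loop repeat attempts
            do (handler-case (return (funcall thunk))
                 (connection-lost () (funcall reconnect)))
            finally (error "connection still down after ~d attempts" attempts)))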
Fixes #546 (initial report).
Given the new SQLite test case from issue #563, we see that pgloader doesn't
handle errors gracefully in the post-copy stage. That's because the API was
not properly defined: we should use pgsql-execute-with-timing rather than
another construct here, because it allows the "on error resume next"
behavior we want for after-load DDL statements.
See #563.
Many options are now available to pgloader users, including shortcuts that
were not defined clearly enough. That could result in stupid things being
done at times.
In particular, when picking the "data only" option, indexes are not to be
dropped before loading the data, but pgloader would still try to create them
again at the end of the load, because the option that controls that behavior
defaults to true and is not impacted by the "data only" choice.
In this patch we review the logic and ensure it's applied in the same
fashion in the different phases of the database migration: preparation,
copying, rebuilding of indexes and completion of the database model.
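In other words, the "data only" shortcut now consistently overrides the
index related options, along the lines of this illustrative sketch (option
names are made up for the example, not pgloader's actual slots):

    (defun effective-index-options (&key data-only (drop-indexes t) (create-indexes t))
      "When DATA-ONLY is requested, neither drop indexes before the COPY
       nor create them again afterwards, whatever the defaults say."
      (if data-only
          (values nil nil)
          (values drop-indexes create-indexes)))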
See also 96b2af6b2a where we began fixing
oddities but didn't go far enough.
See #476, where it would have been helpful to have the PostgreSQL catalog
queries, as logged with `--log-min-messages sql`, in the bug report. It's
also more generally useful.
The new log level sits between NOTICE and INFO, allowing a complete log of
the SQL queries sent to the server while avoiding the very verbose traffic
of the DEBUG log level.
See #498.
Replace the ad-hoc code that was used before in the load-from-file code
path with our full internal catalog representation, and adjust APIs to
that end.
The goal is to use catalogs everywhere in the PostgreSQL target API,
allowing us to reason explicitly about source and target catalogs; see #400
for the main use case.
First, add indexes and foreign keys to the list of objects supported by
the shared catalog facility, where they were previously only found in the
pgsql schema specific package for historical reasons.
Then also add to our internal catalog structures the notion of a trigger
and a stored procedure, allowing for cleaner support of advanced default
values in the MySQL cast functions.
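The shared catalog now covers roughly the following kinds of objects; the
defstruct sketch below only illustrates the shape and is not the actual
pgloader definitions:

    (defstruct table     schema name columns indexes fkeys triggers comment)
    (defstruct index     name columns unique primary)
    (defstruct fkey      name columns foreign-table foreign-columns)
    (defstruct trigger   name action procedure)
    (defstruct procedure name returns language body)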
Now that we have a proper and complete catalog, review the pgsql module
DDL output functions in terms of the catalog and rewrite the schema
creation support so that it directly benefits from our internal catalog
representation.
In passing, clean up the code organisation of the pgsql target support
module to make it easier to work with.
The next step consists of getting rid of src/pgsql/queries.lisp: this
facility should be replaced by using a target catalog that we fetch the
usual way, thanks to the new src/pgsql/pgsql-schema.lisp file and its
list-all-* functions.
That will in turn allow for an explicit step of merging the pre-existing
PostgreSQL catalog when it's been created by tools other than pgloader,
that is, when migrating with the help of an ORM. See #400 for details.