The verbosity is not that easy to adjust. Remove useless messages and add a
new one telling when the COPY of a table is done. As we might have to wait
for some time while indexes are being built, keep the CREATE INDEX lines.
Also keep the ALTER TABLE lines, both for primary keys and foreign keys,
again because the user might have to wait for quite some time.
This change was long overdue. Ideally we would use something like the YeSQL
library for Clojure, but it seems like the cl-yesql equivalent is not ready
yet, and it depends on an experimental build system...
So this patch introduces a URL abstraction built on top of a hash table.
You can then reference src/pgsql/sql/list-all-columns.sql as
(sql "pgsql/list-all-columns.sql")
in the source code directly.
So for now the templating system is CL's format language. It is still an
improvement over embedded strings. Again, one step at a time.
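A minimal sketch of the idea (names and details are illustrative, not
pgloader's actual implementation): SQL files get registered into a hash
table keyed by a short name, and the sql function looks the query text up
and runs it through CL's format:

    (defvar *sql-queries* (make-hash-table :test 'equal)
      "Maps names such as \"pgsql/list-all-columns.sql\" to their query text.")

    (defun register-sql-file (name pathname)
      "Read PATHNAME and register its content under NAME."
      (with-open-file (s pathname)
        (let ((text (make-string (file-length s))))
          (setf (gethash name *sql-queries*)
                (subseq text 0 (read-sequence text s))))))

    (defun sql (name &rest format-arguments)
      "Return the query registered under NAME, with FORMAT-ARGUMENTS filled
       in through CL:FORMAT directives embedded in the query text."
      (apply #'format nil (gethash name *sql-queries*) format-arguments))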
When pgloader fetches the index list from a source database, it doesn't
fetch information about access methods for the indexes: I don't even know if
the overlap between index access methods from one RDBMS to another covers
more than just btree...
It could happen that MySQL indexes a "geometry" column, though. This
datatype is converted automatically to "point" by pgloader, which is good.
But the index creation would fail with the following error message:
Database error 42704: data type point has no default operator class for access method "btree"
In this patch, when setting up the target schema, we issue a PostgreSQL
catalog query to dynamically list those datatypes without btree support and
fetch their opclasses, with a hard-coded preference for GiST, then GIN, so
as to be able to automatically use the proper access method when btree isn't
available. And now pgloader transparently issues the proper statement:
CREATE INDEX idx_168468_idx_location ON pagila.address USING gist(location);
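The catalog query could take a shape along these lines (an illustrative
sketch, not necessarily the exact query pgloader ships): list the default
opclasses of non-btree access methods for every data type that has no btree
opclass at all, ordering GiST before GIN:

    (defparameter *non-btree-opclasses-sql* "
      select t.typname, am.amname, oc.opcname
        from pg_opclass oc
             join pg_am am on am.oid = oc.opcmethod
             join pg_type t on t.oid = oc.opcintype
       where oc.opcdefault
         and am.amname <> 'btree'
         and not exists (select 1
                           from pg_opclass b
                                join pg_am bam on bam.oid = b.opcmethod
                          where b.opcintype = t.oid
                            and bam.amname = 'btree')
       order by t.typname,
                case am.amname when 'gist' then 0 when 'gin' then 1 else 2 end")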
Currently this exploration is limited to indexes with a single column. To
implement the general case we would need a more complex lookup: we would
have to find the intersection of all the supported access methods for all
the involved columns.
Of course we might need to do that someday. One step at a time is plenty
good enough for now, though.
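For reference, the general-case lookup could be as simple as intersecting
per-column access-method lists; this is a hypothetical sketch, nothing of
the sort is implemented yet:

    (defun common-access-methods (column-type-names type->access-methods)
      "Return the access methods supported by every data type listed in
       COLUMN-TYPE-NAMES, looking each type up in the TYPE->ACCESS-METHODS
       hash table."
      (reduce (lambda (methods type-name)
                (intersection methods
                              (gethash type-name type->access-methods)
                              :test #'string=))
              (rest column-type-names)
              :initial-value (gethash (first column-type-names)
                                      type->access-methods)))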
In the "complete PostgreSQL schema" step, an error would be logged as you
expect but poorly handled: it would have the whole transaction rolled back,
meaning that a single Primary Key definition failure would cancel all the
others, plus the foreign keys, and also the triggers and comments.
It happens that other systems allow a primary key column to contain NULL
values, which is forbidden by the standard and enforced by PostgreSQL, so
that's not a theoretical concern here.
The code was also too complex, and the transaction / connection handling
wasn't good enough: we did too many reconnections when a ROLLBACK; is all we
need to be able to continue our processing.
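The intended pattern is roughly the following sketch, with hypothetical
execute and log-error helpers standing in for the real code: wrap each DDL
statement in its own transaction, and on error just ROLLBACK on the same
connection and move on to the next statement:

    (defun execute-each-or-rollback (connection statements)
      (dolist (statement statements)
        (handler-case
            (progn (execute connection "BEGIN")
                   (execute connection statement)
                   (execute connection "COMMIT"))
          (error (e)
            (log-error "PostgreSQL error in ~s: ~a" statement e)
            (execute connection "ROLLBACK")))))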
Also fix some stats counters about errors handled, and improve the error
messages by explicitly mentioning PostgreSQL and the name of the table where
the error comes from.
It may happen that PostgreSQL is restarted while pgloader is running, or
that for some other reason we lose the connection to the server, and in most
cases we know how to gracefully reconnect and retry, so just do so.
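The retry logic boils down to something like this sketch (condition and
function names are illustrative; the real code catches the driver's
connection error condition):

    (define-condition connection-lost (error) ()) ; stand-in for the driver's condition

    (defun call-with-reconnect (thunk reconnect &key (attempts 3))
      "Call THUNK; when the connection is lost, call RECONNECT and try
       again, up to ATTEMPTS times."
      (loop repeat attempts
            do (handler-case (return (funcall thunk))
                 (connection-lost () (funcall reconnect)))
            finally (error "connection still down after ~d attempts" attempts)))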
Fixes #546 (initial report).
Given the new SQLite test case from issue #563, we see that pgloader doesn't
handle errors gracefully in the post-copy stage. That's because the API was
not properly defined: we should use pgsql-execute-with-timing rather than
another construct here, because it allows the "on error resume next"
behavior we want for after-load DDL statements.
See #563.
Many options are now available to pgloader users, including shortcuts that
were not defined clearly enough. That could result in stupid things being
done at times.
In particular, when picking the "data only" option, indexes are not to be
dropped before loading the data, but pgloader would still try to create them
again at the end of the load, because the option that controls that behavior
defaults to true and is not impacted by the "data only" choice.
In this patch we review the logic and ensure it's applied in the same
fashion in the different phases of the database migration: preparation,
copying, rebuilding of indexes and completion of the database model.
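In other words, the "data only" shortcut now consistently overrides the
index related options, along the lines of this illustrative sketch (option
names are made up for the example, not pgloader's actual slots):

    (defun effective-index-options (&key data-only (drop-indexes t) (create-indexes t))
      "When DATA-ONLY is requested, neither drop indexes before the COPY
       nor create them again afterwards, whatever the defaults say."
      (if data-only
          (values nil nil)
          (values drop-indexes create-indexes)))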
See also 96b2af6b2a where we began fixing
oddities but didn't go far enough.
See #476, where it would have been helpful to have the PostgreSQL catalog
queries, as logged with `--log-min-messages sql`, in the bug report. It's
also more generally useful.
The new log level sits between NOTICE and INFO, allowing a complete log of
the SQL queries sent to the server while avoiding the very verbose traffic
of the DEBUG log level.
See #498.
Replace the ad-hoc code that was used before in the load-from-file code
path with our full internal catalog representation, and adjust APIs to
that end.
The goal is to use catalogs everywhere in the PostgreSQL target API,
allowing us to reason explicitly about source and target catalogs; see #400
for the main use case.
First, add indexes and foreign keys to the list of objects supported by
the shared catalog facility, where they were previously only found in the
pgsql schema specific package for historical reasons.
Then also add to our internal catalog structures the notion of a trigger
and a stored procedure, allowing for cleaner support of advanced default
values in the MySQL cast functions.
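The shared catalog now covers roughly the following kinds of objects; the
defstruct sketch below only illustrates the shape and is not the actual
pgloader definitions:

    (defstruct table     schema name columns indexes fkeys triggers comment)
    (defstruct index     name columns unique primary)
    (defstruct fkey      name columns foreign-table foreign-columns)
    (defstruct trigger   name action procedure)
    (defstruct procedure name returns language body)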
Now that we have a proper and complete catalog, review the pgsql module
DDL output functions in terms of the catalog and rewrite the schema
creation support so that it directly benefits from our internal catalog
representation.
In passing, clean up the code organisation of the pgsql target support
module to make it easier to work with.
The next step consists of getting rid of src/pgsql/queries.lisp: this
facility should be replaced by using a target catalog that we fetch the
usual way, thanks to the new src/pgsql/pgsql-schema.lisp file and its
list-all-* functions.
That will in turn allow for an explicit step of merging the pre-existing
PostgreSQL catalog when it's been created by tools other than pgloader,
that is, when migrating with the help of an ORM. See #400 for details.