pgloader

mirror of https://github.com/dimitri/pgloader.git synced 2025-08-08 07:16:58 +02:00

Author	SHA1	Message	Date
Dimitri Fontaine	501cbed745	Quote database name in ALTER DATABASE "..." SET search_path TO Fixes #933.	2019-05-12 00:50:59 +02:00
Dimitri Fontaine	f28f8e577d	Review log-level for stored procedures. Some MySQL schema level features (on update current_timestamp) are migrated to stored procedures and triggers. We would log the CREATE PROCEDURE statements as LOG level entries instead of SQL level entries, most likely a stray devel/debug choice.	2019-01-08 22:44:07 +01:00
Dimitri Fontaine	8112a9b54f	Improve Citus Distribution Support. With this patch it's now actually possible to backfill the data on the fly when using the new "distribute" commands. The schema is modified to add the distribution key where specified, and changes to the primary and foreign keys happen automatically. Then a JOIN is generated to get the data directly during the COPY streaming to the Citus cluster.	2018-10-16 18:53:41 +02:00
Dimitri Fontaine	fc3a1949f7	Add support for PostgreSQL as a source database. It's now possible to use pgloader to migrate from PostgreSQL to PostgreSQL. That might be useful for several reasons, including applying user defined cast rules at COPY time, or just moving from one hosted solution to another.	2018-08-20 11:09:52 +02:00
Dimitri Fontaine	a0bac47101	Refrain from TRUNCATE'ing an empty list of tables. Fixed #789.	2018-06-15 17:46:31 +02:00
Dimitri Fontaine	d4dc4499a8	Add schema migration support for Redshift as a target. Redshift looks like a very old PostgreSQL (8.0.2) with some extra features and a very limited selection of data types. In this patch we parse the PostgreSQL version() function output and automatically determine if we're connected to Redshift. When connected to Redshift, we then dumb-down our target catalogs to the subset of data types that Redshift actually does support. Also, some catalog queries can't be done in Redshift, and 8.0 didn't have a fully compliant VALUES statement, so we use a temporary table in places where we used to use SELECT ... FROM (VALUES(...)) in pgloader. COPYing data to Redshift isn't possible with just this set of changes, because Redshift also doesn't support the COPY FROM STDIN form. COPY sources are limited, and another patch will have to be cooked to prepare the data from pgloader into a format and location that Redshift knows how to handle. At least, it's possible to migrate a database schema to Redshift already.	2018-05-19 19:16:58 +02:00
Dimitri Fontaine	48af01dbbc	Fix implementation of foreign keys in data only mode. In data-only mode, the foreign keys parameter (which defaults to True) means something special: we remove the fkey definitions prior to the data only load then re-install the fkeys. This got broken in a previous commit, the WITH clause option being processed like the other DDL ones that only make sense when creating the schema. While fixing the setting in copy-database, we have to also fix a nesting bug in complete-pgsql-database that would prevent fkeys from being installed again at the end of the load. This patch not only fixes that choice, but also reviews the implementation of the drop-pgsql-fkeys support function to use a more modern internal API, preparing a list of SQL statements to be sent to the psql-execute level. Fixes #745.	2018-02-19 22:07:43 +01:00
Dimitri Fontaine	4612e68435	Implement support for new casting rules guards and actions. Namely the actions are “keep extra” and “drop extra” and the casting rule guard is “with extra on update current timestamp”. Having support for those elements in the casting rules allows a definition such as the following: type timestamp with extra on update current timestamp to "timestamp with time zone" drop extra The effect of such a cast rule would be to ignore the MySQL extra definition and then prevent pgloader from creating the PostgreSQL triggers that implement the same behavior. Fix #735.	2018-01-31 15:17:05 +01:00
Dimitri Fontaine	db7a91d6c4	Add the MySQL target schema to the search_path. In the next release, pgloader defaults to targeting a new schema named the same as the MySQL database, because that's what makes more sense. But people are used to having 'public' in the search_path and everything in there. So when creating our target schema, when migrating from MySQL, arrange it so that the new schema is in the search_path by issuing a command like: ALTER DATABASE plop SET search_path TO public, f1db; And make this command visible in verbose (NOTICE) mode too, so that the user can see what happens. Fix #654. I think.	2017-11-02 12:40:21 +01:00
Dimitri Fontaine	72c58306ba	Fix the previous fix. See #614. Again. Should be ok now.	2017-08-25 01:56:34 +02:00
Dimitri Fontaine	f20a5a0667	Fix schema name comparison with quoted schema names. In the previous commit we introduced support for database names including spaces, which means that by default pgloader creates a target schema in PostgreSQL with a space in its name. That works well as long as you always double-quote the schema name, which pgloader does. Now, in our internal catalogs, we keep the schema name double-quoted. And when comparing those quoted schema names to the raw schema name from PostgreSQL, they won't match, and pgloader tries to create the schema again: ERROR Database error 42P06: schema "my sql" already exists Fix the comparison to compare unquoted schema names, fix #614 again: the previous fix would only work the first time.	2017-08-25 01:47:49 +02:00
Dimitri Fontaine	1f242cd29e	Fix comment support to schema qualify target tables.	2017-08-23 11:26:08 +02:00
Dimitri Fontaine	03a8d57a50	Review --verbose log message. The verbosity is not that easy to adjust. Remove useless messages and add a new one telling when the COPY of a table is done. As we might have to wait for some time for indexes to be built, keep the CREATE INDEX lines. Also keep the ALTER TABLE both for primary keys and foreign keys, again because the user might have to wait for quite some time.	2017-08-21 15:27:13 +02:00
Dimitri Fontaine	8405c331a9	Error handling improvements for PostgreSQL schema. In the complete PostgreSQL schema step, an error would be logged as you expect but poorly handled: it would have the whole transaction rolled back, meaning that a single Primary Key definition failure would cancel all the others, plus the foreign keys, and also the triggers and comments. It happens that other systems allow a primary column to contain NULL values, which is forbidden in the standard and enforced by PostgreSQL, so that's not a theoretical concern here.	2017-07-05 17:53:33 +02:00
Dimitri Fontaine	60c1146e18	Assorted fixes. Refrain from killing the Common Lisp image when doing interactive regression testing if we typo'ed the regression test file name...	2017-06-29 12:35:40 +02:00
Dimitri Fontaine	5faf8605ce	Fix corner cases and how we log them. In the prepare-pgsql-database method we were logging too many details, such as DDL warnings on if-not-exists for successful queries. And those logs are to be found in PostgreSQL server logs anyway. Also fix trying to create or drop a "nil" schema.	2017-06-17 18:16:18 +02:00
Dimitri Fontaine	25e5ea9ac3	Refactor error handling in complete-pgsql-database. Given the new SQLite test case from issue #563 we see that pgloader doesn't handle errors gracefully in the post-copy stage. That's because the API was not properly defined; we should use pgsql-execute-with-timing rather than other constructs here, because it allows the "on error resume next" behavior we want with after load DDL statements. See #563.	2017-06-08 12:09:11 +02:00
Dimitri Fontaine	320a545533	Fix SQL types creation: consider views too. When migrating views from e.g. MySQL it is necessary to consider the user defined SQL types (ENUMs) those views might be using.	2016-12-18 19:31:21 +01:00
Dimitri Fontaine	2dc733c4d6	Fix corner case in creating indexes again. When the option "drop indexes" is in use in loading data from a file, we collect the indexes from the PostgreSQL catalogs and then issue DROP commands against them before the load, then CREATE commands when it's done. The CREATE is done in parallel, and we create an lparallel kernel for that. The kernel must have a worker-count of at least 1, and we were not considering the case of 0 indexes on the target table. Fix #484.	2016-11-20 17:17:15 +01:00
Dimitri Fontaine	5b6adb02b0	Implement and use DROP ... IF EXISTS. In cases where we have a WITH include drop option, we are generating lots of SQL DROP statements. We may be running against an empty target database or in other situations where the target object of the DROP command might not exist. Add support for that case.	2016-09-10 18:01:04 +02:00
Dimitri Fontaine	f2dcf982d8	Fix stats collections in some cases. Calling a -with-timing from within a with-stats-collection macro is redundant and will have the numbers counted twice. Which in this case didn't happen because the stats label was manually copied, but borked with a typo in one copy.	2016-08-28 20:29:53 +02:00
Dimitri Fontaine	a86a606d55	Improve existing PostgreSQL database handling. When loading data into an existing PostgreSQL catalog, we DROP the indexes for better performance of the data loading. Some of the indexes are UNIQUE or even PRIMARY KEYS, and some FOREIGN KEYS might depend on them in the PostgreSQL dependency tracking of the catalog. We used to use the CASCADE option when dropping the indexes, which hides a bug: if we exclude from the load tables with foreign keys pointing to tables we target, then we would DROP those foreign keys because of the CASCADE option, but fail to install them again at the end of the load. To prevent that from happening, pgloader now queries the PostgreSQL pg_depend system catalog to list the “missing” foreign keys and adds them to our internal catalog representation, from which we know to DROP then CREATE the SQL object at the proper times. See #400 as this was an oversight in fixing this issue.	2016-08-10 22:02:06 +02:00
Dimitri Fontaine	43261e0016	Fix double-counting of fkeys in stats reports.	2016-08-08 21:09:15 +02:00
Dimitri Fontaine	70572a2ea7	Implement support for existing target databases. Also known as the ORM case, it happens that other tools are used to create the target schema. In that case pgloader's job is to fill in the existing target tables with the data from the source tables. We still focus on load speed and pgloader will now DROP the constraints (Primary Key, Unique, Foreign Keys) and indexes before running the COPY statements, and re-install the schema it found in the target database once the data load is done. This behavior is activated when using the “create no tables” option as in the following test-case setup: with create no tables, include drop, truncate Fixes #400, for which I got a test-case to play with!	2016-08-06 20:19:15 +02:00

24 Commits
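
Several commits above (501cbed745, db7a91d6c4, f20a5a0667) revolve around double-quoting identifiers in ALTER DATABASE "..." SET search_path TO and around comparing quoted against unquoted schema names. The idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not pgloader's code (pgloader is written in Common Lisp): the helper names quote_ident, unquote_ident, alter_search_path and same_schema are hypothetical, and the sketch quotes every schema in the path, which the commit messages do not state.

```python
from typing import Iterable


def quote_ident(name: str) -> str:
    """Double-quote a PostgreSQL identifier, doubling any embedded quote."""
    return '"' + name.replace('"', '""') + '"'


def unquote_ident(name: str) -> str:
    """Strip one level of double-quoting if present, else return as-is."""
    if len(name) >= 2 and name.startswith('"') and name.endswith('"'):
        return name[1:-1].replace('""', '"')
    return name


def alter_search_path(dbname: str, schemas: Iterable[str]) -> str:
    """Build the ALTER DATABASE statement with every identifier quoted."""
    path = ", ".join(quote_ident(s) for s in schemas)
    return f"ALTER DATABASE {quote_ident(dbname)} SET search_path TO {path};"


def same_schema(catalog_name: str, pg_name: str) -> bool:
    """Compare schema names on their unquoted form (hypothetical helper)."""
    return unquote_ident(catalog_name) == unquote_ident(pg_name)


if __name__ == "__main__":
    print(alter_search_path("my db", ["public", "f1db"]))
    print(same_schema('"my sql"', "my sql"))
```

Comparing on the unquoted form is what keeps a schema stored internally as "my sql" (with quotes) from being treated as different from the raw name my sql reported by the server, the situation described in f20a5a0667.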