Commit Graph

27 Commits

Author SHA1 Message Date
adrian
3da4422fb5 few typos in comments/strings 2023-05-01 10:18:30 +02:00
noctarius aka Christoph Engelbert
644f2617e7
Added support for sequences with minvalue defined (#1429)
When a sequence is defined with a minimum value, and the sequence to migrate is empty, the setval fails due to 1 being the default (which may be lower than the defined minimum value)

Signed-off-by: Christoph Engelbert (noctarius) <me@noctarius.com>
2022-09-12 16:17:49 +02:00
Dimitri Fontaine
3b5c29b030 Attempt to fix foreign-key creation to tables that have been filtered out.
See #1016 where we try to build the DDL for a foreign key that references
tables that are not found in our catalogs. We should probably just ignore
those foreign keys, as we might have a partial load to implement.
2020-02-12 21:50:18 +01:00
Dimitri Fontaine
501cbed745 Quote database name in ALTER DATABASE "..." SET search_path TO
Fixes #933.
2019-05-12 00:50:59 +02:00
Dimitri Fontaine
f28f8e577d Review log-level for stored procedures.
Some MySQL schema level features (on update current_timestamp) are migrated
to stored procedures and triggers. We would log the CREATE PROCEDURE
statements as LOG level entries instead of SQL level entries, most likely a
stray devel/debug choice.
2019-01-08 22:44:07 +01:00
Dimitri Fontaine
8112a9b54f Improve Citus Distribution Support.
With this patch it's now actually possible to backfill the data on the fly
when using the "distribute" new commands. The schema is modified to add the
distribution key where specified, and changes to the primary and foreign
keys happen automatically. Then a JOIN is generated to get the data directly
during the COPY streaming to the Citus cluster.
2018-10-16 18:53:41 +02:00
Dimitri Fontaine
fc3a1949f7 Add support for PostgreSQL as a source database.
It's now possible to use pgloader to migrate from PostgreSQL to PostgreSQL.
That might be useful for several reasons, including applying user defined
cast rules at COPY time, or just moving from an hosted solution to another.
2018-08-20 11:09:52 +02:00
Dimitri Fontaine
a0bac47101 Refrain from TRUNCAT'ing an empty list of tables.
Fixed #789.
2018-06-15 17:46:31 +02:00
Dimitri Fontaine
d4dc4499a8 Add schema migration support for Redshift as a target.
Redshift looks like a very old PostgreSQL (8.0.2) with some extra features
and a very limited selection of data types. In this patch we parse the
PostgreSQL version() function output and automatically determine if we're
connected to Redshift.

When connected to Redshift, we then dumb-down our target catalogs to the
subset of data types that Redshift actually does support.

Also, some catalog queries can't be done in Redshift, and 8.0 didn't have
fully compliant VALUES statement, so we use a temporary table in places
where we used to use SELECT ... FROM (VALUES(...)) in pgloader.

COPYing data to Redshift isn't possible with just this set of changes,
because Redshift also don't support the COPY FROM STDIN form. COPY sources
are limited, and another patch will have to be cooked to prepare the data
from pgloader into a format and location that Redshift knows how to handle.

At least, it's possible to migrate a database schema to Redshift already.
2018-05-19 19:16:58 +02:00
Dimitri Fontaine
48af01dbbc Fix implementation of foreign keys in data only mode.
In data-only mode, the foreign keys parameter (which defaults to True) means
something special: we remove the fkey definitions prior to the data only
load then re-install the fkeys.

This got broken in a previous commit, the WITH clause option being processed
like the other DDL ones that only make sense when creating the schema. While
fixing the setting in copy-database, we have to also fix a nesting bug in
complete-pgsql-database that would prevent fkey to be installed again at the
end of the load.

This patch not only fix that choice, but also review the implementation of
the drop-pgsql-fkeys support function to use more modern internal API,
preparing a list of SQL statements to be sent to the psql-execute level.

Fixes #745.
2018-02-19 22:07:43 +01:00
Dimitri Fontaine
4612e68435 Implement support for new casting rules guards and actions.
Namely the actions are “keep extra” and “drop extra” and the casting rule
guard is “with extra on update current timestamp”. Having support for those
elements in the casting rules allow such a definition as the following:

      type timestamp with extra on update current timestamp
        to "timestamp with time zone" drop extra

The effect of such as cast rule would be to ignore the MySQL extra
definition and then refrain pgloader from creating the PostgreSQL triggers
that implement the same behavior.

Fix #735.
2018-01-31 15:17:05 +01:00
Dimitri Fontaine
db7a91d6c4 Add the MySQL target schema to the search_path.
In the next release, pgloader defaults to targetting a new schema named the
same as the MySQL database, because that's what makes more sense. But people
are used to having 'public' in the search_path and everything in there.

So when creating our target schema, when migrating from MySQL, arrange it so
that the new schema is in the search_path by issuing a command like:

  ALTER DATABASE plop SET search_path TO public, f1db;

And make this command visible in verbose (NOTICE) mode too, so that user can
see what happens.

Fix #654. I think.
2017-11-02 12:40:21 +01:00
Dimitri Fontaine
72c58306ba Fix the previous fix.
See #614. Again. Should be ok now.
2017-08-25 01:56:34 +02:00
Dimitri Fontaine
f20a5a0667 Fix schema name comparing with quoted schema names.
In the previous commit we introduced support for database names including
spaces, which means that by default pgloader creates a target schema in
PostgreSQL with a space in its name. That works well as soon as you always
double-quote the schema name, which pgloader does.

Now, in our internal catalogs, we keep the schema name double-quoted. And
when comparing that schema names with quotes to the raw schema name from
PostgreSQL, they won't match, and pgloader tries to create the schema again:

  ERROR Database error 42P06: schema "my sql" already exists

Fix the comparing to compare unquoted schema name, fix #614 again: the
previous fix would only work the first time.
2017-08-25 01:47:49 +02:00
Dimitri Fontaine
1f242cd29e Fix comment support to schema qualify target tables. 2017-08-23 11:26:08 +02:00
Dimitri Fontaine
03a8d57a50 Review --verbose log message.
The verbosity is not that easy to adjust. Remove useless messages and add a
new one telling when the COPY of a table is done. As we might have to wait
for some time for indexes being built. keep the CREATE INDEX lines. Also
keep the ALTER TABLE both for primary keys and foreign keys, again because
the user might have to wait for quite some time.
2017-08-21 15:27:13 +02:00
Dimitri Fontaine
8405c331a9 Error handling improvements for PostgreSQL schema.
In the complete PostgreSQL schema step, an error would be logged as you
expect but poorly handled: it would have the whole transaction rolled back,
meaning that a single Primary Key definition failure would cancel all the
others, plus the foreign keys, and also the triggers and comments.

It happens that other systems allow a primary column to contain NULL values,
which is forbidden in the standard and enforced by PostgreSQL, so that's not
a theoritical concern here.
2017-07-05 17:53:33 +02:00
Dimitri Fontaine
60c1146e18 Assorted fixes.
Refrain from killing the Common Lisp image when doing interactive regression
testing if we typo'ed the regression test file name...
2017-06-29 12:35:40 +02:00
Dimitri Fontaine
5faf8605ce Fix corner cases and how we log them.
In the prepare-pgsql-database method we were logging too much details, such
as DDL warnings on if-not-exists for successful queries. And those logs are
to be found in PostgreSQL server logs anyway.

Also fix trying to create or drop a "nil" schema.
2017-06-17 18:16:18 +02:00
Dimitri Fontaine
25e5ea9ac3 Refactor error handling in complete-pgsql-database.
Given new SQLite test case from issue #563 we see that pgloader doesn't
handle errors gracefully in post-copy stage. That's because the API were not
properly defined, we should use pgsql-execute-with-timing rather than other
construct here, because it allows the "on error resume next" behavior we
want with after load DDL statements.

See #563.
2017-06-08 12:09:11 +02:00
Dimitri Fontaine
320a545533 Fix SQL types creation: consider views too.
When migrating views from e.g. MySQL it is necessary to consider the
user defined SQL types (ENUMs) those views might be using.
2016-12-18 19:31:21 +01:00
Dimitri Fontaine
2dc733c4d6 Fix corner case in creating indexes again.
When the option "drop indexes" is in use in loading data from a file, we
collect the indexes from the PostgreSQL catalogs and then issue DROP
commands against them before the load, then CREATE commands when it's
done.

The CREATE is done in parallel, and we create an lparallel kernel for
that. The kernel must have a worker-count of at least 1, and we where
not considering the case of 0 indexes on the target table.

Fix #484.
2016-11-20 17:17:15 +01:00
Dimitri Fontaine
5b6adb02b0 Implement and use DROP ... IF EXISTS.
In cases where we have a WITH include drop option, we are generating
lots of SQL DROP statements. We may be running an empty target database
or in other situations where the target object of the DROP command might
not exists. Add support for that case.
2016-09-10 18:01:04 +02:00
Dimitri Fontaine
f2dcf982d8 Fix stats collections in some cases.
Calling a -with-timing from within a with-stats-collection macro is
redundant and will have the numbers counted twice. Which in this case
didn't happen because the stats label was manually copied, but borked
with a typo in one copy.
2016-08-28 20:29:53 +02:00
Dimitri Fontaine
a86a606d55 Improve existing PostgreSQL database handling.
When loading data into an existing PostgreSQL catalog, we DROP the
indexes for better performance of the data loading. Some of the indexes
are UNIQUE or even PRIMARY KEYS, and some FOREIGN KEYS might depend on
them in the PostgreSQL dependency tracking of the catalog.

We used to use the CASCADE option when dropping the indexes, which hides
a bug: if we exclude from the load tables with foreign keys pointing to
tables we target, then we would DROP those foreign keys because of the
CASCADE option, but fail to install them again at the end of the load.

To prevent that from happening, pgloader now query the PostgreSQL
pg_depend system catalog to list the “missing” foreign keys and add them
to our internal catalog representation, from which we know to DROP then
CREATE the SQL object at the proper times.

See #400 as this was an oversight in fixing this issue.
2016-08-10 22:02:06 +02:00
Dimitri Fontaine
43261e0016 Fix double-counting of fkeys in stats reports. 2016-08-08 21:09:15 +02:00
Dimitri Fontaine
70572a2ea7 Implement support for existing target databases.
Also known as the ORM case, it happens that other tools are used to
create the target schema. In that case pgloader job is to fill in the
exiting target tables with the data from the source tables.

We still focus on load speed and pgloader will now DROP the
constraints (Primary Key, Unique, Foreign Keys) and indexes before
running the COPY statements, and re-install the schema it found in the
target database once the data load is done.

This behavior is activated when using the “create no tables” option as
in the following test-case setup:

  with create no tables, include drop, truncate

Fixes #400, for which I got a test-case to play with!
2016-08-06 20:19:15 +02:00