It used to be that our casting rules mechanism would allow for matching
unsigned data types only, and we sometimes have a need to do special
behavior on signed data types.
In particular, a signed bigint(20) in MySQL has the same values range as a
PostgreSQL bigint, so we don't need to target a numeric in that case. It's
only when the bigint is unsigned that we need to target a numeric.
In passing update some of the default casting rules documentation to match
the code.
Fix#982.
The support for drop default in (user defined) casting rules was completely
broken in SQLite, because the code didn't even bother looking at what's
returning after applying the casting rules.
This patch fixes the code so that is uses the pgcol instance's default
value, as per after applying casting rules. The bug also existed in a subtle
form for MySQL and MS SQL, but would only show up there when the default
value is spelled using a known variation of “current timestamp”.
Namely the actions are “keep extra” and “drop extra” and the casting rule
guard is “with extra on update current timestamp”. Having support for those
elements in the casting rules allow such a definition as the following:
type timestamp with extra on update current timestamp
to "timestamp with time zone" drop extra
The effect of such as cast rule would be to ignore the MySQL extra
definition and then refrain pgloader from creating the PostgreSQL triggers
that implement the same behavior.
Fix#735.
MySQL allows using unsigned data types and pgloader should then target a
signed type of a larger capacity so that values can fit. For example, the
data definition “smallint(5) unsigned” should be casted to “integer”.
This patch allows user defined cast rules to be written against “unsigned”
data types as per their MySQL catalog representation.
See #678.
The individual CAST decisions are visible in the CREATE TABLE statements
that are logged a moment later. Also, calling `format-create-sql' on a
column definition that's not finished to be casted will process default
values before they get normalized, and issue WARNING to the poor user.
Not helpful. Bye.
First, add index and foreign keys to the list of objects supported by
the shared catalog facility, where is was only found in the pgsql schema
specific package for historical raisons.
Then also add to our catalog internal structures the notion of a trigger
and a stored procedure, allowing for cleaner advanced default values
support in the MySQL cast functions.
Once we now have a proper and complete catalog, review the pgsql module
DDL output function in terms of the catalog and rewrite the schema
creation support so that it takes direct benefit of our internal
catalogs representation.
In passing, clean-up the code organisation of the pgsql target support
module to be easier to work with.
Next step consists of getting rid of src/pgsql/queries.lisp: this
facility should be replaced by the usage of a target catalog that we
fetch the usual way, thanks to the new src/pgsql/pgsql-schema.lisp file
and list-all-* functions.
That will in turn allow for an explicit step of merging the pre-existing
PostgreSQL catalog when it's been created by other tools than pgloader,
that is when migrating with the help of an ORM. See #400 for details.
Use case: Django dissuades setting NULL “on string-based fields […]
because empty string values will always be stored as empty strings, not
as NULL. If a string-based field has null=True, that means it has two
possible values for »no data«: NULL, and the empty string. In most
cases, it’s redundant to have two possible values for »no data«; the
Django convention is to use the empty string, not NULL.”.
pgloader already supports custom transformations which can be used to
replace NULL values in string-based columns with empty strings. Setting
NOT NULL constraint on those columns could possibly be achieved by
running a database query to extract their names and then generating
relevant ALTER TABLE statements, but a cast option in pgloader is a more
convenient way.
This fixes#141 again when users are forcing MySQL bigint(20) into
PostgreSQL bigint types so that foreign keys can be installed. To this
effect, as cast rule such as the following is needing:
cast type bigint when (= 20 precision) to bigint drop typemod
Before this patch, this user provided cast rule would also match against
MySQL types "with extra auto_increment", and it should not.
If you're having the problem that this patch fixes on an older pgloader
that you can't or won't upgrade, consider the following user provided
set of cast rules to achieve the same effect:
cast type bigint with extra auto_increment to bigserial drop typemod,
type bigint when (= 20 precision) to bigint drop typemod
In a previous commit the typemod matching code had been broken, and we
failed to notice that until now. Thanks to bug report #322 we just got
the memo...
Add a test case in the local-only MySQL database.
The regression testing facilities should be improved to be able to test
a full database, and then to dynamically create said database from code
or something to ease test coverage of those cases.
In order to share more code in between the different source types,
finally have a go at the quite horrible mess of anonymous data
structures floating around.
Having a catalog and schema instances not only allows for code cleanup,
but will also allow to implement some bug fixes and wishlist items such
as mapping tables from a schema to another one.
Also, supporting database sources having a notion of "schema" (in
between "catalog" and "table") should get easier, including getting
on-par with MySQL in the MS SQL support (materialized views has been
asked for already).
See #320, #316, #224 for references and a notion of progress being made.
In passing, also clean up the copy-databases methods for database source
types, so that they all use a fetch-metadata generic function and a
prepare-pgsql-database and a complete-pgsql-database generic function.
Actually, a single method does the job here.
The responsibility of introspecting the source to populate the internal
catalog/schema representation is now held by the fetch-metadata generic
function, which in turn will call the specialized versions of
list-all-columns and friends implementations. Once the catalog has been
fetched, an explicit CAST call is then needed before we can continue.
Finally, the fields/columns/transforms slots in the copy objects are
still being used by the operative code, so the internal catalog
representation is only used up to starting the data copy step, where the
copy class instances are then all that's used.
This might be refactored again in a follow-up patch.
Commit 598c860cf5 broke user defined
casting rules by interning "precision" and "scale" in the
pgloader.user-symbols package: those symbols need to be found in the
pgloader.transforms package instead.
Luckily enough the infrastructure to do that was already in place for
cl:nil.