The code expects the keyword :auto-increment rather than a string nowadays
in order to process an extra column bits of information as meaning that we
want to cast to a serial/bigserial datatype.
In a previous commit we re-used the package name pgloader.copy for the now
separated implementation of the COPY protocol, but this package was already
in use for the implementation of the COPY file format as a pgloader source.
Oops.
And CCL was happily doing its magic anyway, so that I've been blind to the
problem.
To fix, rename the new package pgloader.pgcopy, and to avoid having to deal
with other problems of the same kind in the future, rename every source
package pgloader.source.<format>, so that we now have pgloader.source.copy
and pgloader.pgcopy, two visibily different packages to deal with.
This light refactoring came with a challenge tho. The split in between the
pgloader.sources API and the rest of the code involved some circular
depencendies in the namespaces. CL is pretty flexible here because it can
reload code definitions at runtime, but it was still a mess. To untangle it,
implement a new namespace, the pgloader.load package, where we can use the
pgloader.sources API and the pgloader.connection and pgloader.pgsql APIs
too.
A little problem gave birth to quite a massive patch. As it happens when
refactoring and cleaning-up the dirt in any large enough project, right?
See #748.
Several places in the code are involved to deal with the default values from
MS SQL. The catalog query is dealing with strange quoting rules on the
source side and used to fill in directly the PostgreSQL expected value. But
then the quoting of a function call wasn't properly handled.
Rather than coping with the quoting rules here, have the catalog query
return a pgloader specific placeholder "GENERATE_UUID". Then the MS SQL
specific code can normalize that to the symbol :generate_uuid. Then the
generic PostgreSQL DDL code can implement the proper replacement for that
symbol, not having to know where it comes from.
Fix#742.
The support for drop default in (user defined) casting rules was completely
broken in SQLite, because the code didn't even bother looking at what's
returning after applying the casting rules.
This patch fixes the code so that is uses the pgcol instance's default
value, as per after applying casting rules. The bug also existed in a subtle
form for MySQL and MS SQL, but would only show up there when the default
value is spelled using a known variation of “current timestamp”.
Previously to this patch, pgloader wouldn't care about which schema it
creates extra types in. Extra types are mainly ENUM and SET support from
MySQL. Now, pgloader creates those extra PostgreSQL ENUM types in the same
schema as the table using them, which is a more sound default.
The previous commit makes it possible to convert typemods for various
text types. In MSSQL it's possible to create a column like varchar(max).
Internally this is reported as varchar(-1) which results in a CREATE
TABLE statement that contains e.g. varchar(-1).
This patch drops the typemod if it's -1 (max).
It's based on Dimitris patch slightly modified by myself.
If a custom CAST rule is defined (e.g CAST type varchar to varchar) the
original typemod get's lost. This commit is based on Dimitris patch from
571 and adds typmod support for "char", "nchar", "varchar", "nvarchar"
and "binary".
The previous coding would discard any work done at the apply-casting-rules
step when adding source specific smarts about handling default, because of
what looks like negligence and bad tests. A test case scenario exists but
was not exercized :(
Fix that by defaulting the default value to the one given back at the
apply-casting-rules stage, where we apply the "drop default" clause.
Given a test case and some reading of the FreeTDS source code, it appears
that the XML data type is sent on the wire as (unicode) text. This patch
makes pgloader aware of that and also revisit the choice of casting XML to
PostgreSQL XML data type (thanks to the test case where we see it just works
without surprise).
Fix#503.
Add a cast rule to support tinyint being an “identity” type in MS SQL, which
means using a sequence to derive its values from. We didn't address the
whole MS SQL integer type tower here, and suspect we will have to add more
in the future.
Fix#528 where I could have access to a test case and easily reproduce the
bug, thanks!
Availability of a test case for MS SQL allows to make progress on this
limitation and add support to the smalldatetime data type. It is
converted server-side with the same CONVERT expression as the longer
datetime datatype.
Fixes#431.
When the column is known to be non-nullable, refrain from adding a null
default value to it. This also fixes the case of casting from an
[int] IDENTITY(1,1) NOT NULL
That otherwise did get transformed into a
bigserial not null default NULL
Causing then the error
Database error 42601: multiple default values specified for column ... of table ...
First, add index and foreign keys to the list of objects supported by
the shared catalog facility, where is was only found in the pgsql schema
specific package for historical raisons.
Then also add to our catalog internal structures the notion of a trigger
and a stored procedure, allowing for cleaner advanced default values
support in the MySQL cast functions.
Once we now have a proper and complete catalog, review the pgsql module
DDL output function in terms of the catalog and rewrite the schema
creation support so that it takes direct benefit of our internal
catalogs representation.
In passing, clean-up the code organisation of the pgsql target support
module to be easier to work with.
Next step consists of getting rid of src/pgsql/queries.lisp: this
facility should be replaced by the usage of a target catalog that we
fetch the usual way, thanks to the new src/pgsql/pgsql-schema.lisp file
and list-all-* functions.
That will in turn allow for an explicit step of merging the pre-existing
PostgreSQL catalog when it's been created by other tools than pgloader,
that is when migrating with the help of an ORM. See #400 for details.
Having been given a test instance of a MS SQL database allows to quickly
fix a series of assorted bugs related to schema handling of MS SQL
databases. As it's the only source with a proper notion of schema that
pgloader supports currently, it's not a surprise we had them.
Fix#343. Fix#349. Fix#354.
In order to share more code in between the different source types,
finally have a go at the quite horrible mess of anonymous data
structures floating around.
Having a catalog and schema instances not only allows for code cleanup,
but will also allow to implement some bug fixes and wishlist items such
as mapping tables from a schema to another one.
Also, supporting database sources having a notion of "schema" (in
between "catalog" and "table") should get easier, including getting
on-par with MySQL in the MS SQL support (materialized views has been
asked for already).
See #320, #316, #224 for references and a notion of progress being made.
In passing, also clean up the copy-databases methods for database source
types, so that they all use a fetch-metadata generic function and a
prepare-pgsql-database and a complete-pgsql-database generic function.
Actually, a single method does the job here.
The responsibility of introspecting the source to populate the internal
catalog/schema representation is now held by the fetch-metadata generic
function, which in turn will call the specialized versions of
list-all-columns and friends implementations. Once the catalog has been
fetched, an explicit CAST call is then needed before we can continue.
Finally, the fields/columns/transforms slots in the copy objects are
still being used by the operative code, so the internal catalog
representation is only used up to starting the data copy step, where the
copy class instances are then all that's used.
This might be refactored again in a follow-up patch.
The default for MS SQL float types is to only have a precision defined,
as described in https://msdn.microsoft.com/en-us/library/ms173773.aspx,
but the pgloader code didn't know what to do with a float without scale.
This is a blind patch given that I couldn't CREATE TABLE as per the bug
report to try and see by myself what's happening. Better have some tests
going on though.
Now it's possible to parse a command to load data from MS SQL. The
parser was until now parsing all database URI within the same common
rule and that isn't possible anymore if we want to distinguish in
between source database right from the parser, which we actually want to
do.
This patch also implement in-passing fixes all over the place, including
the transformation function float-to-string that only happened to work
on double-float data.
Converting the table definitions (with type casting) seems to work. Also
did experiment a little with actuallt fetching some data... and had to
edit the cl-mssql driver, which is temporarily monkey patched.