It used to be that our casting rules mechanism would allow for matching
unsigned data types only, and we sometimes have a need to do special
behavior on signed data types.
In particular, a signed bigint(20) in MySQL has the same values range as a
PostgreSQL bigint, so we don't need to target a numeric in that case. It's
only when the bigint is unsigned that we need to target a numeric.
In passing update some of the default casting rules documentation to match
the code.
Fix #982.
Implement a generic-function API to discover the source database schema and
populate pgloader's internal version of the catalogs. Cut three copies of about
the same code path down to a single shared one, thanks to applying some amount
of OOP to the code.
Clozure doesn't have the CP866 encoding that the DBF files are using, and
then PostgreSQL 9.6 doesn't have "create schema if not exists", which makes
the tests fail on Travis.
The cl-db3 lib just got improvements for new dbase file types and field
types, reflect those in pgloader.
Also, cl-db3 now can read the encoding of the file (language driver)
directly in the header, meaning we can rely on that metadata by default, and
only override it when the user tells us to.
See #961.
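A minimal sketch of the "trust the header, let the user override" logic,
written in Python rather than pgloader's Lisp. The function name, the partial
driver table and the fallback value are illustrative assumptions and not the
cl-db3 API; the sketch assumes the language driver ID is the byte at offset 29
of the DBF header.

```python
from typing import Optional

# Partial, illustrative mapping from DBF language driver IDs to Python codecs.
LANGUAGE_DRIVERS = {
    0x01: "cp437",   # DOS USA
    0x02: "cp850",   # DOS Multilingual
    0x03: "cp1252",  # Windows ANSI
    0xC9: "cp1251",  # Russian Windows
}


def dbf_encoding(header: bytes,
                 user_encoding: Optional[str] = None,
                 fallback: str = "ascii") -> str:
    """Pick the encoding for a DBF file.

    An explicit user choice always wins; otherwise use the language
    driver byte from the header, and fall back when it is unknown.
    """
    if user_encoding is not None:
        return user_encoding
    if len(header) <= 29:
        return fallback
    return LANGUAGE_DRIVERS.get(header[29], fallback)
```

With such a helper the load command would only need an encoding option when
the file metadata is missing or wrong.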
Before that it was necessary to install a function in the lisp environment
either in the source itself in src/utils/transforms.lisp, or in a lisp file
loaded with --load-lisp-file (or -l for short).
While this could be good enough, sometimes a very simple combination of
existing features is required to transform a function and so doing some
level of lisp coding directly in the load command is a nice to have.
Fixes #961.
Fixed file formats might contain a header line with column names and a hint
of the size of each column. While it might be a long shot that we can
actually use that as a proper fixed-format specification, this patch
implements a guess mode that also outputs the parsed header.
When the parsing gets some of the details wrong, it might
still be a good start to copy/paste from the command output and go from
there.
Fixes #958.
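A rough sketch of what guessing a fixed-format specification from a header
line can look like, in Python for illustration only. The function name and the
(name, start, length) output shape are assumptions, not pgloader's actual
implementation: each column is taken to start at its name and to run until the
next name begins.

```python
import re


def guess_fixed_columns(header: str):
    """Return (name, start, length) triples guessed from a header line.

    Each column spans its name plus the whitespace that follows it.
    The last column only gets the width of its name unless the header
    line is padded, which is one reason to review the guess by hand.
    """
    columns = []
    for match in re.finditer(r"\S+\s*", header.rstrip("\n")):
        field = match.group()
        columns.append((field.strip(), match.start(), len(field)))
    return columns
```

Printing the guessed triples gives the user something to copy/paste and then
correct where the header was misleading.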
The URL of the test case source has changed. Use the new one. Also set the
encoding properly: the client_encoding trick has been deprecated for a while
now, as pgloader only talks to Postgres in UTF-8.
We migrate bit(xx) to the same PostgreSQL data type, bit(xx), where we can
use the bit string notation documented at the following URL. In particular the
COPY syntax accepts the notation Xabcd for the values, which is quite nice
when MySQL sends the data to us as a byte vector:
https://www.postgresql.org/docs/current/datatype-bit.html
Fixes #943.
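As an illustration of the Xabcd notation, here is a small Python sketch
turning a byte vector into that form. The function name is hypothetical and
this is not pgloader's transformation function.

```python
def bit_copy_literal(raw: bytes) -> str:
    """Render a byte vector in the X<hex digits> notation for bit strings.

    Every hex digit stands for 4 bits, so the result always describes a
    multiple of 8 bits; widths that are not a multiple of 8 would need
    extra handling that this sketch leaves out.
    """
    return "X" + raw.hex()
```

For example, the two bytes 0xAB 0xCD come out as Xabcd.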
The casting support for DB3 was hand-crafted and didn't get upgraded to
using the current CAST grammar and facilities, for no other reasons than
lack of time and interest. It so happens that implementing it now fixes two
bug reports.
Bug #938 is about conversion defaulting to "not null" column, and that's due
to the usage of the internal pgloader catalogs where the target column's
nullable field is NIL by default, which doesn't make much sense. With
support for user-defined casting rules, the default is nullable columns, so
that's kind of a free fix.
Fixes #927.
Fixes #938.
In some cases when migrating from MySQL we want to transform data from
binary representation to a hexadecimal number. One such case is going from
MySQL binary(16) to PostgreSQL UUID data type.
Fixes #904.
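The binary(16) to UUID case can be sketched in Python as follows. The function
name is made up for the example, and it assumes the 16 bytes are stored in the
same order as the UUID text form (no byte swapping).

```python
import uuid


def binary16_to_uuid(raw: bytes) -> str:
    """Format 16 raw bytes as the hexadecimal text form of a UUID."""
    if len(raw) != 16:
        raise ValueError(f"expected 16 bytes, got {len(raw)}")
    return str(uuid.UUID(bytes=raw))
```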
When using a CSV header, we might find fields in a different order than the
target table columns, and maybe not all of the fields are going to be read.
Take account of the header we read rather than expecting the header to look
like the target table definition.
Fix #888.
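A minimal Python sketch of driving the load from the header actually read,
rather than from the target table definition. The function name and return
shape are assumptions for illustration: the column list follows the file's
order and only keeps the fields that exist in the target table.

```python
import csv
import io


def read_with_header(csv_text: str, target_columns):
    """Return (columns, rows) using the order found in the CSV header.

    Header fields that are not target columns are skipped, and target
    columns missing from the file are simply not part of the load.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    keep = [i for i, name in enumerate(header) if name in target_columns]
    columns = [header[i] for i in keep]
    rows = [[row[i] for i in keep] for row in reader]
    return columns, rows
```

The returned column list is what a COPY statement would name explicitly, so
that the data lands in the right columns whatever the file order is.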
This allows creating tables in any target tablespace rather than the default
one, and is supported for the various sources having support for the ALTER
TABLE clause already.
Materialized views without an explicit schema name are supported, but then
would raise an error when trying to use destructuring-bind on a string
rather than the (cons schema-name table-name). This patch fixes that.
This gives a default "null if" option to all the input columns at once, and
it's still possible to override the default per column.
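The default-with-override behavior can be pictured with this small Python
sketch. The names are hypothetical and the real option is part of the load
command syntax, not a function call.

```python
def apply_null_if(value, column, default_null_if=None, per_column=None):
    """Return None when the value equals the column's "null if" marker.

    The per-column marker, when present, takes precedence over the
    default that applies to every input column.
    """
    marker = (per_column or {}).get(column, default_null_if)
    return None if value == marker else value
```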
In passing, fix project-fields declarations that SBCL now complains about
when they're not true, such as declaring a vector when we might have :null
or nil. As a result, remove the (declare (optimize speed)) in the generated
field processing code.
The code emitted by pgloader to transform input fields into PostgreSQL
column values was using too many optimization declarations, some of which
SBCL failed to follow through on for lack of type marking in the generated
code.
As SBCL doesn't have enough information to be optimizing anyway, at least we
can make it so that we don't have a warning about it. The new code does that.
Fixes #803.
It's now possible to use pgloader to migrate from PostgreSQL to PostgreSQL.
That might be useful for several reasons, including applying user defined
cast rules at COPY time, or just moving from a hosted solution to another.
Given the variety of ways to setup default behavior for datetime and
timestamp data types in MySQL, we need yet more default casting rules. It
might be time to think about a more principled way to solve the problem, but
on the other hand, this ad-hoc one also comes with full overriding
flexibility for the end user.
Fixes #811.
It used to be that the extra clause had to be parsed before guards, but
there's no reason why a user wouldn't think to write their clauses the other
way round, so add support for that as well.
See #779.
The MySQL connection string parameter for SSL usage is useSSL, so map an
option name to our expected values for sslmode in database connection
strings.
See #748.
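A sketch of the option mapping in Python. The function name, the default and
the target values ("require" and "disable") are placeholders chosen for the
example; they are assumptions, not necessarily the values pgloader uses
internally.

```python
from urllib.parse import parse_qs, urlsplit

# Assumed mapping from the MySQL-style boolean to an sslmode-like value.
USE_SSL_TO_SSLMODE = {"true": "require", "false": "disable"}


def sslmode_from_uri(uri: str, default: str = "prefer") -> str:
    """Translate a useSSL=true|false query parameter into an sslmode."""
    query = parse_qs(urlsplit(uri).query)
    value = query.get("useSSL", [None])[0]
    if value is None:
        return default
    try:
        return USE_SSL_TO_SSLMODE[value.lower()]
    except KeyError:
        raise ValueError(f"unexpected useSSL value: {value!r}")
```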
The support for drop default in (user defined) casting rules was completely
broken in SQLite, because the code didn't even bother looking at what's
returned after applying the casting rules.
This patch fixes the code so that it uses the pgcol instance's default
value, as it stands after applying casting rules. The bug also existed in a subtle
form for MySQL and MS SQL, but would only show up there when the default
value is spelled using a known variation of “current timestamp”.
First review the `sqlite_sequence` support so that we can still work with
databases that don't have this catalog, which doesn't always exist -- it
might depend on the SQLite version though.
Then while at it use the sql macro to host the SQLite “queries” in their own
files, enhancing the hackability of the system to some degree. Not that
much, because we have to use a lot of PRAGMA commands and then the column
output isn't documented with the query text itself.
Namely the actions are “keep extra” and “drop extra” and the casting rule
guard is “with extra on update current timestamp”. Having support for those
elements in the casting rules allows such a definition as the following:
    type timestamp with extra on update current timestamp
      to "timestamp with time zone" drop extra
The effect of such a cast rule would be to ignore the MySQL extra
definition and then keep pgloader from creating the PostgreSQL triggers
that implement the same behavior.
Fix #735.
We forgot that rule in the case of creating the target tables for the
materializing views commands, which led to surprising and wrong behavior.
Fix #721, and add a new test case while at it.
The query for concurrency-support didn't get the memo that we should ignore
PostgreSQL identifier-case when querying the source MySQL database. Fix the
query string to include column names as given by the MySQL catalogs.
In bug report #703, the problem is found in PostgreSQL queries. This has
been fixed before already. Trying to reproduce the bug produced an error in
the concurrency-support query instead, so let's fix this one.
Fix #703.
When this function was written, pgloader would get an array of numbers over
the wire; nowadays it looks like it's receiving an array of characters
instead (in other words, a string).
Improve the `bits-to-boolean` function to accept either input, and raise an
error in any other case.
My theory is that something changed either in MySQL (with version 10) or in
the Qmynd driver somehow... but tonight we just go easy and fix the bug
locally rather than try and understand where it might be coming from.
Fixes #684.
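The "accept either input" behavior can be sketched in Python like this. It is
an illustration, not a port of the Lisp function: it assumes the string form
carries the bit as the character code of its first character, and that the
result is the PostgreSQL boolean text "t" or "f".

```python
def bits_to_boolean(value) -> str:
    """Convert a MySQL bit(1) value to "t" or "f".

    Accepts a sequence of numbers (bytes, list, tuple) or a string whose
    first character code is the bit; anything else is an error.
    """
    if isinstance(value, str):
        if not value:
            raise ValueError("empty string")
        bit = ord(value[0])
    elif isinstance(value, (bytes, bytearray, list, tuple)):
        if len(value) == 0:
            raise ValueError("empty bit vector")
        bit = value[0]
    else:
        raise TypeError(f"unsupported bit value: {value!r}")
    return "f" if bit == 0 else "t"
```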
SQLite being very very liberal in type names (I think it accepts anything
and everything actually), our simple approach of tokenizing the input and
discarding noise words is not enough.
In this patch, we implement a new light parser for the SQLite type names to
better cope with noise words and random spacing of the catalog values that
SQLite failed to normalize. Well it didn't attempt, apparently.
Fix #548.
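To give an idea of what a light parser for such type names has to cope with,
here is a Python sketch. The noise word list, the regular expression and the
(name, typemod) result are all assumptions made for the example; the actual
parser lives in pgloader's Lisp code and may differ in every detail.

```python
import re

# Hypothetical noise words; the real list is defined in pgloader itself.
NOISE_WORDS = {"unsigned", "signed", "short", "varying", "native"}

# A type name, optionally followed by "(n)" or "(n, m)" with free spacing.
TYPE_RE = re.compile(
    r"^\s*([^()]*?)\s*(?:\(\s*(\d+)\s*(?:,\s*(\d+)\s*)?\))?\s*$"
)


def parse_sqlite_type(text: str):
    """Return (normalized type name, typemod tuple) for a catalog value."""
    match = TYPE_RE.match(text.lower())
    if match is None:
        raise ValueError(f"cannot parse type name: {text!r}")
    words = [w for w in match.group(1).split() if w not in NOISE_WORDS]
    if not words:
        raise ValueError(f"no type name found in: {text!r}")
    typemod = tuple(int(g) for g in match.groups()[1:] if g is not None)
    return " ".join(words), typemod
```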
MySQL allows using unsigned data types and pgloader should then target a
signed type of a larger capacity so that values can fit. For example, the
data definition “smallint(5) unsigned” should be cast to “integer”.
This patch allows user defined cast rules to be written against “unsigned”
data types as per their MySQL catalog representation.
See #678.
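The reasoning behind targeting a larger signed type is plain range arithmetic,
which this Python sketch spells out. The function and the tables are
illustrative only and pick the smallest PostgreSQL integer type that fits;
pgloader's default casting rules are a separate, hand-written list.

```python
import re

# PostgreSQL signed integer types and their widths in bits.
PG_INTEGER_TYPES = [("smallint", 16), ("integer", 32), ("bigint", 64)]

# MySQL integer types and their widths in bits.
MYSQL_INT_BITS = {"tinyint": 8, "smallint": 16, "mediumint": 24,
                  "int": 32, "bigint": 64}


def pg_integer_target(mysql_ctype: str) -> str:
    """Pick a PostgreSQL type able to hold every value of a MySQL type."""
    match = re.fullmatch(r"(\w+)(?:\(\d+\))?(\s+unsigned)?",
                         mysql_ctype.strip().lower())
    if match is None or match.group(1) not in MYSQL_INT_BITS:
        raise ValueError(f"not a MySQL integer type: {mysql_ctype!r}")
    bits = MYSQL_INT_BITS[match.group(1)]
    unsigned = match.group(2) is not None
    max_value = 2 ** bits - 1 if unsigned else 2 ** (bits - 1) - 1
    for name, pg_bits in PG_INTEGER_TYPES:
        if max_value <= 2 ** (pg_bits - 1) - 1:
            return name
    return "numeric"  # nothing smaller fits, e.g. unsigned bigint
```

The same arithmetic shows why a signed bigint(20) can stay a bigint while the
unsigned variant needs numeric.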
The error handling would try and read past the error buffer in some cases,
when the BABEL lib would give a position that's after the buffer read.
Fix #661.
The default values quoting changed in MariaDB 10, and we need to adjust in
pgloader: extra '' chars could defeat the default matching logic:
    "'0000-00-00'" is different from "0000-00-00"
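One way to make the matching robust to both spellings is to normalize the
default value before comparing it, as in this Python sketch. The function name
is hypothetical and this is not the code used in pgloader.

```python
def unquote_default(default):
    """Strip one pair of surrounding single quotes, if present."""
    if (default is not None and len(default) >= 2
            and default[0] == "'" and default[-1] == "'"):
        return default[1:-1]
    return default
```

After normalization, both "'0000-00-00'" and "0000-00-00" compare equal to the
value the default matching logic expects.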
The MySQL special syntax "on update current_timestamp()" used to support
only a single column per table (in MySQL), and so did pgloader. In MariaDB
version 10 it's now possible to have several columns with that special
treatment, so adapt pgloader to migrate that too.
What pgloader does is recognize that several columns are to receive the same
pre-update processing, and create a single function that handles all of
them, as in the following example, from pgloader logs in a test case:
    CREATE OR REPLACE FUNCTION mysql.on_update_current_timestamp_onupdate()
      RETURNS trigger
      LANGUAGE plpgsql
      AS
    $$
    BEGIN
       NEW.update_date = now();
       NEW.calc_date = now();
       RETURN NEW;
    END;
    $$;
    CREATE TRIGGER on_update_current_timestamp
            BEFORE UPDATE ON mysql.onupdate
       FOR EACH ROW
       EXECUTE PROCEDURE mysql.on_update_current_timestamp_onupdate();
Fixes #629.
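The "one function for all the columns of a table" idea can be sketched as a
small SQL generator in Python. The function name is made up, and the sketch
does no identifier quoting, which real code would need.

```python
def on_update_trigger_sql(schema: str, table: str, columns) -> str:
    """Build one trigger function that refreshes every listed column."""
    function = f"{schema}.on_update_current_timestamp_{table}"
    assignments = "\n".join(f"   NEW.{column} = now();" for column in columns)
    return (
        f"CREATE OR REPLACE FUNCTION {function}()\n"
        "  RETURNS trigger\n"
        "  LANGUAGE plpgsql\n"
        "  AS\n"
        "$$\n"
        "BEGIN\n"
        f"{assignments}\n"
        "   RETURN NEW;\n"
        "END;\n"
        "$$;\n"
        "CREATE TRIGGER on_update_current_timestamp\n"
        f"        BEFORE UPDATE ON {schema}.{table}\n"
        "   FOR EACH ROW\n"
        f"   EXECUTE PROCEDURE {function}();"
    )
```

Calling it with the schema "mysql", the table "onupdate" and the columns
update_date and calc_date yields statements of the same shape as the log
excerpt above.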
At the moment it's a very manual process, and it might get automated
someday. Meanwhile it's still useful to have.
See #569 for an issue that got a test case added.