Commit Graph

245 Commits

Author SHA1 Message Date
Dimitri Fontaine
ebc72c454e Add a MySQL use case.
See #1102.
2020-03-22 15:20:19 +01:00
Dimitri Fontaine
d8b0bd5145 Allow casting rules to guard on signed data types.
It used to be that our casting rules mechanism would allow for matching
unsigned data types only, and we sometimes have a need to do special
behavior on signed data types.

In particular, a signed bigint(20) in MySQL has the same values range as a
PostgreSQL bigint, so we don't need to target a numeric in that case. It's
only when the bigint is unsigned that we need to target a numeric.

In passing update some of the default casting rules documentation to match
the code.

Fix #982.
2019-06-04 15:22:25 +02:00
Dimitri Fontaine
b8da7dd2e9
Generic Function API for Materialized Views support. (#970)
Implement a generic-function API to discover the source database schema and
populate pgloader internal version of the catalogs. Cut down three copies of about
the same code-path down to a single shared one, thanks to applying some amount
of OOP to the code.
2019-05-20 19:28:38 +02:00
Dimitri Fontaine
12e788094b Improve sexp parser and standard symbols support.
Also add split-sequence to the list of special cases that we can use and is
not found in the pgloader.transforms package.

Fixes #965.
2019-05-14 15:49:24 +02:00
Dimitri Fontaine
06216eea99 Refrain from using fancy schema names in the regression tests.
Then being compatible with PostgreSQL 9.6 is too much trouble.
2019-05-12 00:10:28 +02:00
Dimitri Fontaine
e5f78d978e Remove added DBF tests from the Travis target.
Clozure doesn't have the CP866 encoding that the DBF files are using, and
then PostgreSQL 9.6 doesn't have "create schema if not exists", which makes
the tests fail on Travis.
2019-05-11 23:52:47 +02:00
Dimitri Fontaine
98b465fbef Add the new DBF tests in the test suite.
All with expected results so that we can track regressions there.
2019-05-11 22:13:18 +02:00
Dimitri Fontaine
350cffffad Improve DBF support.
The cl-db3 lib just got improvements for new dbase file types and field
types, reflect those in pgloader.

Also, cl-db3 now can read the encoding of the file (language driver)
directly in the header, meaning we can rely on that metadata by default, and
only override it when the users tells us to.

See #961.
2019-05-11 20:50:52 +02:00
Dimitri Fontaine
7d2e5ae941 Allow lambda expression as cast rules using functions.
Before that it was necessary to install a function in the lisp environment
either in the source itself in src/utils/transforms.lisp, or in a lisp file
loaded with --load-lisp-file (or -l for shorts).

While this could be good enough, sometimes a very simple combination of
existing features is required to transform a function and so doing some
level of lisp coding directly in the load command is a nice to have.

Fixes #961.
2019-05-08 19:02:18 +02:00
Dimitri Fontaine
b1d55e07d6 Implement header parsing for fixed file formats.
Fixed file formats might contain an header line with column names and a hint
of the size of each column. While it might be a long shot that we can
acutally use that as a proper fixed-format specification, this patch
implements a guess mode that also outputs the parsed header.

In case when the parsing is wrong in some level of details, it might
actually be a good start to copy/paste from the command output and go from
there.

Fixes #958.
2019-05-07 22:57:55 +02:00
Dimitri Fontaine
27b1a83b9f Allow quoted table and column names in user-defined CASTing rules.
Fixes #955.
2019-04-30 16:55:26 +02:00
Dimitri Fontaine
febb2c11be Update HTTP/ZIP/DBF test case.
The URL of the test case source has changed. Use the new one. Also set the
encoding properly, the client_encoding trick has been deprecated for awhile
now, as pgloader only talks to Postgres in UTF-8.
2019-04-28 23:34:31 +02:00
Dimitri Fontaine
513455f552 Implement support for MySQL bitstrings.
We migrate bit(xx) to the same PostgreSQL datatype bit(xx) where in Postgres
we can use bitstring as documented at the following URL. In particular the
COPY syntax accepts the notation Xabcd for the values, which is quite nice
when MySQL sends the data to us a a byte vector:

  https://www.postgresql.org/docs/current/datatype-bit.html

Fixes #943.
2019-04-19 12:52:04 +02:00
Dimitri Fontaine
39fc78e08f Implement user-defined casting rules support for DB3.
The casting support for DB3 was hand-crafted and didn't get upgraded to
using the current CAST grammar and facilities, for no other reasons than
lack of time and interest. It so happens what implementing it now fixes two
bug reports.

Bug #938 is about conversion defaulting to "not null" column, and that's due
to the usage of the internal pgloader catalogs where the target column's
nullable field is NIL by default, which doesn't make much sense. With
support for user-defined casting rules, the default is nullable columns, so
that's kind of a free fix.

Fixes #927.
Fixes #938.
2019-04-14 21:46:04 +02:00
Dimitri Fontaine
957caa877e Add a new transform function for byte-vector to hexadecimal string.
In some cases when migrating from MySQL we want to transform data from
binary representation to an hexadecimal number. One such case is going from
MySQL binary(16) to PostgreSQL UUID data type.

Fixes #904.
2019-03-18 14:21:33 +01:00
Dimitri Fontaine
dae5dec03c Allow fields/columns projections when parsing header.
When using a CSV header, we might find fields in a different order than the
target table columns, and maybe not all of the fields are going to be read.
Take account of the header we read rather than expecting the header to look
like the target table definition.

Fix #888.
2019-01-15 22:39:08 +01:00
Dimitri Fontaine
2147a1d07b Implement ALTER TABLE ... SET TABLESPACE ... as a pgloader clause.
This allows creating tables in any target tablespace rather than the default
one, and is supported for the various sources having support for the ALTER
TABLE clause already.
2019-01-08 22:50:24 +01:00
Dimitri Fontaine
b8e8cf7d18 Fix bugs in the recent extended support for materialized views.
Materialized views without an explicit schema name are supported, but then
would raise an error when trying to use destructuring-bind on a string
rather than the (cons schema-name table-name). This patch fixes that.
2018-12-28 10:53:01 +01:00
Dimitri Fontaine
bda06f8ac0 Implement Citus support from a MySQL database. 2018-12-17 16:31:47 +01:00
Dimitri Fontaine
290ad68d61 Implement materialize views in PostgreSQL source support. 2018-12-16 23:17:37 +01:00
Dimitri Fontaine
3f2f10eef1 Finish implementation of CAST rules for PostgreSQL source databases.
Add a link to the table from the internal catalogs for columns so that we
can match table-source-name in cast rules when migrating from PostgreSQL.
2018-11-19 19:33:37 +01:00
Dimitri Fontaine
5ecf04acb9 Implement null if support as a WITH option.
This gives a default "null if" option to all the input columns at once, and
it's still possible to override the default per column.

In passing, fix project-fields declarations that SBCL now complains about
when they're not true, such as declaring a vector when we might have :null
or nil. As a result, remove the (declare (optimize speed)) in the generated
field processing code.
2018-11-13 21:41:27 +01:00
Dimitri Fontaine
656bf85075 Review field to column projection code emitted.
The code emitted by pgloader to transform input fields into PostgreSQL
column values was using too many optimization declarations, some of them
that SBCL failed to follow through for lack of type marking in the generated
code.

As SBCL doesn't have enough information to be optimizing anyway, at least we
can make it so that we don't have a warning about it. The new code does that.

Fixes #803.
2018-11-10 20:22:04 +01:00
Dimitri Fontaine
7b487ddaca Add a Citus distribution test case, from the citus tutorial. 2018-10-18 15:42:17 +02:00
Dimitri Fontaine
d356bd501b Accept even more ragged date format input.
When parsing a date string from a date format, accept that the ms or us part
be completely missing, rather than just missing some digits.

Fixed #828.
2018-09-10 19:37:36 +02:00
Dimitri Fontaine
fc3a1949f7 Add support for PostgreSQL as a source database.
It's now possible to use pgloader to migrate from PostgreSQL to PostgreSQL.
That might be useful for several reasons, including applying user defined
cast rules at COPY time, or just moving from an hosted solution to another.
2018-08-20 11:09:52 +02:00
Dimitri Fontaine
46d14af0d3 Add more default rules to MySQL datetime handling.
Given the variety of ways to setup default behavior for datetime and
timestamp data types in MySQL, we need yet more default casting rules. It
might be time to think about a more principled way to solve the problem, but
on the other hand, this ad-hoc one also comes with full overriding
flexibility for the end user.

Fixes #811.
2018-07-08 20:37:06 +02:00
Christoph Berg
cb528c2e19 All included test data has been verified as free, stop building a +dfsg tarball. 2018-07-03 16:37:43 +02:00
Dimitri Fontaine
7d8ac3b352 Reproduce a test case from issue #795. 2018-06-15 22:14:04 +02:00
Dimitri Fontaine
a0bac47101 Refrain from TRUNCAT'ing an empty list of tables.
Fixed #789.
2018-06-15 17:46:31 +02:00
Christoph Berg
30f90cb848 test/Makefile: Allow configuring the superuser database name
Also, don't ignore errors while setting up the database
2018-06-04 10:52:14 +02:00
Dimitri Fontaine
1fe835d31b Add sample data for fields escaped by double-quote.
See #787.
2018-04-29 19:05:52 +02:00
Dimitri Fontaine
a392328dad Allow any ordering of guards and extra cast rule clauses.
It used to be that extra were forced to being parsed before guards, but
there's no reason why a user wouldn't think to write its clauses the other
way round, so add support for that as well.

See #779.
2018-04-29 19:00:20 +02:00
Dimitri Fontaine
e4dca1a086 Implement support for MySQL useSSL=true|false option.
The MySQL connection string parameter for SSL usage is useSSL, so map an
option name to our expected values for sslmode in database connection
strings.

See #748.
2018-03-16 16:41:40 +01:00
Dimitri Fontaine
4301503df2 Add a new test case for {{ENVVAR}} template support.
See #555.
2018-02-20 14:45:07 +01:00
Dimitri Fontaine
ea6c91b429 Fix "drop default" casting rules for all databases.
The support for drop default in (user defined) casting rules was completely
broken in SQLite, because the code didn't even bother looking at what's
returning after applying the casting rules.

This patch fixes the code so that is uses the pgcol instance's default
value, as per after applying casting rules. The bug also existed in a subtle
form for MySQL and MS SQL, but would only show up there when the default
value is spelled using a known variation of “current timestamp”.
2018-02-08 23:33:51 +01:00
Dimitri Fontaine
29506e6fa6 Assorted fixes for SQLite.
First review the `sqlite_sequence` support so that we can still work with
databases that don't have this catalog, which doesn't always exists -- it
might depend on the SQLite version though.

Then while at it use the sql macro to host the SQLite “queries” in their own
files, enhancing the hackability of the system to some degrees. Not that
much, because we have to use a lot of PGRAMA command and then the column
output isn't documented with the query text itself.
2018-02-08 22:55:15 +01:00
Dimitri Fontaine
4612e68435 Implement support for new casting rules guards and actions.
Namely the actions are “keep extra” and “drop extra” and the casting rule
guard is “with extra on update current timestamp”. Having support for those
elements in the casting rules allow such a definition as the following:

      type timestamp with extra on update current timestamp
        to "timestamp with time zone" drop extra

The effect of such as cast rule would be to ignore the MySQL extra
definition and then refrain pgloader from creating the PostgreSQL triggers
that implement the same behavior.

Fix #735.
2018-01-31 15:17:05 +01:00
Dimitri Fontaine
7b08b6e3d3 Refrain from creating tables in “data only” operations.
We forgot that rule in the case of creating the target tables for the
materializing views commands, which led to surprising and wrong behavior.

Fix #721, and add a new test case while at it.
2018-01-25 23:32:31 +01:00
Dimitri Fontaine
07cdf3e7e5 Use MySQL column names in MySQL queries.
The query for concurrency-support didn't get the memo that we should ignore
PostgreSQL identifier-case when querying the source MySQL database. Fix the
query string to include column names as given by the MySQL catalogs.

In bug report #703, the problem is found in PostgreSQL queries. This has
been fixed before already. Trying to reproduce the bug produced an error in
the concurrency-support query instead, so let's fix this one.

Fix #703.
2017-12-22 14:15:46 +01:00
Dimitri Fontaine
b7d87a9eb1 Fix MySQL bit(1) casting function.
When this function was written, pgloader would get an array of numbers over
the wire, nowadays it looks like it's receiving an array of characters
instead (in other words, a string).

Improve the `bits-to-boolean` function to accept either input, and raise an
error in another case.

My theory is that something changed either in MySQL (with version 10) or in
the Qmynd driver somehow... but tonight we just go easy and fix the bug
locally rather than try and understand where it might be coming from.

Fixes #684.
2017-12-03 23:06:54 +01:00
Dimitri Fontaine
52f13456d9 Rewrite the SQLite type name parsing.
SQLite being very very liberal in type names (I think it accepts anything
and everything actually), our simple approach of tokenizing the input and
discarding noise words is not enough.

In this patch, we implement a new light parser for the SQLite type names to
better cope with noise words and random spacing of the catalog values that
SQLite failed to normalize. Well it didn't attempt, apparently.

Fix #548.
2017-11-28 18:19:12 +01:00
Dimitri Fontaine
2b861a3e96 New SQLite test cases. 2017-11-25 16:31:42 -08:00
Dimitri Fontaine
5c60f8c35c Implement a new type casting guard: unsigned.
MySQL allows using unsigned data types and pgloader should then target a
signed type of a larger capacity so that values can fit. For example, the
data definition “smallint(5) unsigned” should be casted to “integer”.

This patch allows user defined cast rules to be written against “unsigned”
data types as per their MySQL catalog representation.

See #678.
2017-11-22 10:26:03 -08:00
Dimitri Fontaine
1d7706c045 Fix the MySQL encoding error handling.
The error handling would try and read past the error buffer in some cases,
when the BABEL lib would give a position that's after the buffer read.

Fix #661.
2017-11-13 11:27:47 +01:00
Dimitri Fontaine
a9afddf8ed Accept quoted namestrings as target type names for cast rules.
This allows passing "double precision" rather than float8, for example.

Fix #650.
2017-10-21 21:03:58 +02:00
Dimitri Fontaine
b36f36b74e Add a (local) test case. 2017-10-16 17:25:44 +02:00
Dimitri Fontaine
460fe6cc77 Fix quoting of default values for MariaDB 10 support.
The default values quoting changed in MariaDB 10, and we need to adjust in
pgloader: extra '' chars could defeat the default matching logic:

  "'0000-00-00'" is different from "0000-00-00"
2017-09-19 11:29:53 +02:00
Dimitri Fontaine
8a361a0ff8 Add support for multiple on update columns per table.
The MySQL special syntax "on update current_timestamp()" used to support
only a single column per table (in MySQL), and so did pgloader. In MariaDB
version 10 it's now possible to have several column with that special
treatment, so adapt pgloader to migrate that too.

What pgloader does is recognize that several columns are to receive the same
pre-update processing, and creates a single function that does the both of
them, as in the following example, from pgloader logs in a test case:

    CREATE OR REPLACE FUNCTION mysql.on_update_current_timestamp_onupdate()
      RETURNS trigger
      LANGUAGE plpgsql
      AS
    $$
    BEGIN
       NEW.update_date = now();
       NEW.calc_date = now();
       RETURN NEW;
    END;
    $$;
    CREATE TRIGGER on_update_current_timestamp
            BEFORE UPDATE ON mysql.onupdate
          FOR EACH ROW
      EXECUTE PROCEDURE mysql.on_update_current_timestamp_onupdate();

Fixes #629.
2017-09-15 01:04:57 +02:00
Dimitri Fontaine
b7347a567c Add test cases for MySQL.
At the moment it's a very manual process, and it might get automated
someday. Meanwhile it's still useful to have.

See #569 for an issue that got a test case added.
2017-09-14 15:59:10 +02:00