101 Commits

Author SHA1 Message Date
Dimitri Fontaine
8a0c91fa40 Fix conjunctions for the INCLUDING clause in MySQL.
We want all tables matching any of the given constraints (regexp or
equality search).
2014-09-23 16:44:38 +02:00
Dimitri Fontaine
52e3371be8 Review default SQLite options. 2014-09-22 14:34:57 +02:00
Dimitri Fontaine
d4b58a1f78 Review MySQL default options.
The default values for MySQL WITH clause options weren't really tested
and broke on simple cases; the new set of defaults is known to work in
many cases (most?).

Other combinations of options will need some review work, and we might
need to consider preventing some of them; that's for another patch, though.
2014-09-21 12:19:20 -05:00
Dimitri Fontaine
ca52ddacb1 SQLite: transform "0" timestamps to NULL, see #100. 2014-07-30 18:42:49 +02:00
Dimitri Fontaine
ed8022ce64 SQLite: transform default values to their PostgreSQL representation.
When default values are used in SQLite they are of course using their
SQLite representation, which might not be compatible with the PostgreSQL
target data type we're casting to. Make it so that the default values
are transformed too, as we already do in the MySQL case.

See #100.
2014-07-30 16:32:35 +02:00
Dimitri Fontaine
4aa8b0946f Get rid of unused sample file. 2014-07-18 11:07:25 +02:00
Dimitri Fontaine
cb4b2a3334 Convert SQLite tinyint to PostgreSQL smallint, fixes #97. 2014-07-18 11:06:37 +02:00
Dimitri Fontaine
9eff1bb4d8 Travis: Adapt test/ixf.load to work against 9.1. 2014-07-17 17:35:41 +02:00
Dimitri Fontaine
5a2b98856f Update the main SQLite test database. 2014-07-17 16:56:28 +02:00
Dimitri Fontaine
07b5aa3ed6 Add BEFORE/AFTER LOAD clauses to IXF and DBF commands. 2014-07-17 16:56:13 +02:00
Dimitri Fontaine
3e0526c957 Implement early support for IXF files. 2014-07-14 21:53:50 +02:00
Dimitri Fontaine
53a7e47058 New MySQL default Cast Rule for bit(1) to boolean, fix #93.
We need a new transformation function that works with a vector of
integers as input.
2014-07-03 11:47:59 +02:00
Dimitri Fontaine
6d49d9e10a Add a "real" column test case in SQLite main test, Closes #73. 2014-06-29 16:33:42 +02:00
Dimitri Fontaine
55655ed927 Fix fixed-file column name quoting, as we did for CSV, fixes #70. 2014-06-29 16:25:30 +02:00
Dimitri Fontaine
2ac374bcc8 Improve internal testing a bit.
This should get into a full reproducible regression test against MySQL
some day.
2014-06-25 12:43:57 +02:00
Dimitri Fontaine
f7d251ed86 Fix quoting of TRUNCATE command, fix #84.
That patch is not a principled approach to fixing the problem but
should allow for not messing up with fully qualified table names.

A proper way to do it would be to have a pgsql object name structure
composed of the catalog, the schema and the name as separate entries,
with assorted API to print that object properly. That's for another day
though.
2014-06-20 13:10:39 +02:00
Dimitri Fontaine
63e6b506be Travis: tweak some tests for PostgreSQL 9.1 compat. 2014-06-17 01:00:33 +02:00
Dimitri Fontaine
c14d620d83 Travis: let's see some debug information. 2014-06-16 21:15:22 +02:00
Dimitri Fontaine
32b4cf23e8 New test case showing off the 'null if' source field option, see #80. 2014-06-16 19:59:08 +02:00
Dimitri Fontaine
f5c703c206 Handle camelCase column names for CSV, fix #79 again.
The previous patch didn't take into account the need to retain the case
of the PostgreSQL column names when using double-quotes in the load
command, which is now properly forwarded down in the COPY command.
2014-06-16 17:33:14 +02:00
Dimitri Fontaine
fe4f577300 Add the csv-filename-pattern test to the suite. 2014-06-16 14:17:53 +02:00
Dimitri Fontaine
65aabb8216 Add a dbf test to the regression suite. 2014-06-16 14:17:33 +02:00
Dimitri Fontaine
7db001a7c3 Allow the MySQL command parser to process clauses in any order, fix #56.
Only the MySQL command is addressed in this patch, because the code-level
approach does not satisfy me completely. It might be easier to
just bite the bullet and review all the optional clauses return values
rather than add a layer as this patch does.

The feature still is available for MySQL given this patch, so let's push
it, get feedback, then see about how to make the approach scale and
revise all the other commands.
2014-06-15 14:19:38 +02:00
Dimitri Fontaine
1273c42393 Parse SQLite "unsigned" and "short" noise words, fix #72.
In SQLite it's possible to define columns using type names such as
"smallint unsigned" or "short integer", without any changes to the way
those data types are handled, given its "dynamic typing" features.

Improve the pgloader casting machinery for SQLite to handle those cases.
2014-06-04 11:11:50 +02:00
Dimitri Fontaine
f6fae39b2e Explicitly use gawk in the new regression testing facility.
Turns out that debian has mawk by default, which is not behaving the
same in our very simple use case already. In passing, add gawk as a
build dependency of the debian package, because the packaging is meant
to exercise the test cases.
2014-06-03 13:39:21 +02:00
Dimitri Fontaine
3bcd236de6 Add automated regression tests.
Those tests currently only work when a single table is the target of the
load, and when this target is explicit in the INTO target clause. More
work needs to be done to cover interesting cases like MySQL and SQLite
where we want to diff a full database rather than a single table.
2014-06-03 12:19:23 +02:00
Dimitri Fontaine
ae2f7e9ed0 Add an hstore test
This test is currently commented out of the test suite so that we don't
require the hstore being available to run the basic tests.
2014-06-03 10:33:43 +02:00
Dimitri Fontaine
a2370938b6 MATERIALIZE ALL VIEWS.
Complete the MySQL migration feature.
2014-05-26 18:03:50 +02:00
Dimitri Fontaine
e9e9e364b0 Add optional clauses USING FIELDS and TARGET COLUMNS. 2014-05-26 15:04:06 +02:00
Dimitri Fontaine
b17383fa90 Allow IN DIRECTORY sub-clause for the FILENAME MATCHING clause.
With this, the user can now specify the directory where the files are
going to be read and matched against the regular expression. It used not
to be necessary in the archive expansion mode, but is required now that
the feature is exposed in more cases.
2014-05-26 14:45:12 +02:00
Dimitri Fontaine
36805afc64 Fix *csv-path-root* at run-time.
When using LOAD CSV it's possible to load from filenames matching a
regular expression, but for that to work the *csv-path-root* needs to be
properly set up at run-time.
2014-05-26 11:01:19 +02:00
Dimitri Fontaine
9e12035ca1 Review SQLite blob types in light of "manifest typing", fix #60.
When using SQLite 3, a blob column might return either string or byte
vector values dynamically depending on the data itself, or maybe some
more complex parameters controlled at data insert time.

Hard-code the rule that a blob column returned as a string is in fact
base64 encoded (which looks like common practice) and decode it
automatically when needed, before sending to byte-vector-to-bytea. It
might be a tad slow but at least the data is properly converted.

In future, that decision might come and byte us in the back again, at
which point it'll be necessary to consider full casting options as in
the MySQL CAST rules. It seems like a big enough win for now if we can
avoid that.
2014-05-16 23:13:57 +02:00
Dimitri Fontaine
39af63b053 Implement support for SQLite blob to bytea, fixes #59.
This issue has been re-opened with blob instead of double. Semi-blindly
implement support for the blob type with an image data type.

Disturbingly enough when tested with non-binary data SQLite was
returning strings rather than byte vectors, tripping up the transform
function that sure expects byte vectors.
2014-05-16 00:28:02 +02:00
Dimitri Fontaine
d6c457d89a Add support for SQLite "double" data type, Fix #59.
This time with a test case rather than trying to blindly address the
problem in a very small amount of time.
2014-05-15 23:28:21 +02:00
Dimitri Fontaine
c38798a4dd Implement BEFORE/AFTER LOAD EXECUTE 'filename'.
That allows using the same SQL files as usual when using pgloader, as it
even supports the \i and \ir psql features (and dollar quoting, etc).

In passing, refactor docs to avoid saying the same things all over the
place, which isn't a very good idea in a man page, at least as far as
editing it is concerned.
2014-05-04 23:04:45 +02:00
Dimitri Fontaine
ee498111bc Implement MySQL local (socket) connection. Fix #39.
The parser was happily parsing such a connection string as the
following, but the rest of the code didn't really know what to do about
it:

  mysql://unix:/var/run/mysqld/mysqld.sock:/main

In passing, fix bugs where the PostgreSQL unix domain socket connection
was still shy of a brick load, omitting to consider the case where the
connection host is actually a list of '(:unix . "path/to/socket").
2014-05-02 22:48:17 +02:00
Dimitri Fontaine
9084a01086 Switch a test case to the "utf-8" spelling.
The spelling "utf8" is not recognized by CCL.
2014-04-29 15:24:12 +02:00
Dimitri Fontaine
1d5d0ae72f Update the bossa.load test case.
The archive contents seem to have changed, and the regular expression to
match files that we were using doesn't match any filename in the archive
any more.

Also, have the command load more data by parsing more files, using the
ALL FILENAME MATCHING clause.
2014-04-29 14:51:14 +02:00
Dimitri Fontaine
429232c3de Fix loading data from stdin: fix #53.
The stdin support really was one brick shy of a load, and in particular
with-open-file was used against a stream when using that option.
2014-04-27 23:38:02 +02:00
Dimitri Fontaine
efd11ab759 Add user options to control pgloader batch behaviour.
The new WITH options allow the user to set values for the dynamic
variables *copy-batch-rows*, *copy-batch-size* and *concurrent-batches*.
That's needed in cases like issue #16, even with the batch size
defaulting to what looks like a proper setup.

In the longer term, pgloader's memory usage should be seriously
reviewed, the numbers being way higher than the batch sizes we set up
here.
2014-04-27 22:37:17 +02:00
Dimitri Fontaine
789d854799 Fix issue #49 where data could be considered as a format string. 2014-04-23 17:03:35 +02:00
Dimitri Fontaine
56f3da28ed Fix #20 by skipping tables and views missing from the catalogs. 2014-03-04 14:01:04 +01:00
Dimitri Fontaine
46fd6632f2 Fix #40 by providing a per-table forced-encoding option.
This patch takes advantage of the recent patch
62fc85a1cf
so that you will need to freshen your local Qmynd copy if you want to
test from sources.
2014-03-03 23:39:22 +01:00
Dimitri Fontaine
b033aed88b Some more testing. 2014-03-02 01:27:02 +01:00
Dimitri Fontaine
7befe27807 Add encoding errors to the csv-error.load test. 2014-03-02 01:27:02 +01:00
Alexander Pánek
97109153f4 Removed tinyint cast rule
This rule has overridden the default rule for `tinyint(1)` and instead of placing `boolean`, it kept the typemod and placed `boolean(1)` into the resulting query.
2014-02-17 16:14:27 +01:00
Dimitri Fontaine
643875a266 Improve CSV error handling, thanks to cl-csv continue restart. 2014-02-08 17:51:15 +01:00
Dimitri Fontaine
dbfd8cf06c Implement new CSV option "lines terminated by", fixes #23. 2014-02-04 20:58:46 +01:00
Dimitri Fontaine
a8b0f91f37 Allow optional control of batch memory footprint, see #16 and #22.
With the new internal setting *copy-batch-size* it's now possible to
instruct pgloader to close batches early (before *copy-batch-rows* limit)
when crossing the byte count threshold.

When set to 20 MB it allows the new test case (exhausted) to pass under SBCL
and CCL, and there's no measurable cost when *copy-batch-size* is set to
nil (its default value) in the testing done.

This patch is published without any way to tune the values from the command
language yet; that's the next step once it's been proven effective.
2014-01-26 23:22:18 +01:00
Dimitri Fontaine
8ac2cc4930 Skip empty lines when reading from files. 2014-01-24 15:11:15 +01:00