236 Commits

Author SHA1 Message Date
Dimitri Fontaine
de4ff30acc Implement --summary to copy the output to a file, fix #68.
Given that redirecting a tty such as *terminal-io* isn't easy enough,
let's provide a way to copy the summary output to a file. Another way to
solve it would have been to output the summary to the main logs, but
that could have made parsing the logs more difficult than necessary.

Let's see how users like it...
2014-06-14 23:31:11 +02:00
Ronan Dunklau
1e208ad29a Take the default password from PGPASSWORD 2014-06-10 11:45:54 +02:00
Dimitri Fontaine
8becf05803 Register "real" datatype to the existing float transformation.
Attempt at fixing #73.
2014-06-05 01:00:33 +02:00
Dimitri Fontaine
1273c42393 Parse SQLite "unsigned" and "short" noise words, fix #72.
In SQLite it's possible to define columns using type names such as
"smallint unsigned" or "short integer", without any changes to the way
those data types are handled, given its "dynamic typing" features.

Improve the pgloader casting machinery for SQLite to handle those cases.
2014-06-04 11:11:50 +02:00
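The noise-word handling described in 1273c42393 can be sketched as follows. This is a Python illustration only, not pgloader's Common Lisp casting code; the function name and the fallback behavior are assumptions.

```python
# Hypothetical sketch: drop SQLite "noise words" from a declared column type
# before looking up a cast rule. Only the two words named in the commit
# message are treated as noise here.
NOISE_WORDS = {"unsigned", "short"}


def normalize_sqlite_type(type_name: str) -> str:
    """Return the declared type with noise words removed, lowercased."""
    words = type_name.lower().split()
    kept = [w for w in words if w not in NOISE_WORDS]
    # Assumption: if nothing is left, keep the original (lowercased) name.
    return " ".join(kept) if kept else " ".join(words)
```

With this sketch, "smallint unsigned" and "short integer" reduce to "smallint" and "integer", which an existing cast rule could then match.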
Dimitri Fontaine
b4dac6b684 Fix archive filename matching, recent regression.
The census test didn't pass anymore because I broke the archive filename
matching in b17383fa90b81408fa08566bc131ea5b02606023, where the special
variable *csv-path-root* stopped being authoritative in the archive case.

To fix, initialize that variable to nil and give its value priority as
soon as it's non-nil, as in the archive case.
2014-06-03 10:33:43 +02:00
Dimitri Fontaine
9fc7589a9c Output all reporting to *terminal-io*, fix #68. 2014-06-02 12:08:27 +02:00
Dimitri Fontaine
ae63c9b85c Improve COPY CONTEXT message parsing, fix #67.
When adding the CONTEXT message parsing I totally forgot that PostgreSQL
provides a nice error message translation capability. The code now copes
better with the situation, using a more advanced regular expression.

We could inline the known translations in the matching, but that would
be tedious to maintain, so we just use loose matching rules here.
2014-05-28 17:23:08 +02:00
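The "loose matching rules" mentioned in ae63c9b85c could look like the sketch below. It is written in Python and is not the regular expression pgloader uses; the pattern, the function name, and both sample messages in the usage note are illustrative assumptions. The idea is to match the untranslated parts (the COPY keyword, the table name, the number) and accept any single word where the translated "line" label appears.

```python
import re

# Hypothetical loose pattern: "COPY <table>, <any word> <number>".
# The word before the number is "line" in English but differs per locale,
# so it is matched as any run of non-space characters.
CONTEXT_RE = re.compile(r"COPY\s+(?P<table>[^,]+),\s+\S+\s+(?P<line>\d+)")


def parse_copy_context(message: str):
    """Return (table name, line number) or None when nothing matches."""
    match = CONTEXT_RE.search(message)
    if match is None:
        return None
    return match.group("table"), int(match.group("line"))
```

For example, both 'COPY errors, line 3, column b: "2006-13-11"' and a translated variant such as 'COPY errors, ligne 3, colonne b' would yield the same table name and line number, without listing every known translation.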
Dimitri Fontaine
e93ba8b887 Fix handling --client-min-messages and --log-min-messages.
Should help with issue #67 by allowing --client-min-messages to
effectively control entering the debugger in case of unhandled
conditions, etc.

Contrary to the discussion, in this patch --log-min-messages has no
impact on the console and interactive behaviors.
2014-05-28 16:37:38 +02:00
Dimitri Fontaine
89d1ab460d Handle both PostgreSQL reserved keyword catcodes, fix #63. 2014-05-27 17:36:00 +02:00
Dimitri Fontaine
3454247cc7 Fix MySQL view cleanup.
In case of PostgreSQL schema preparation error, and when some
materialized views were given with their SQL command, they were left
over by pgloader. The next run would then fail because the view already
exists at CREATE VIEW time.

Fix that by cleaning up materialized views we just created in handling
any condition signaled when preparing the PostgreSQL schema.
2014-05-27 17:34:59 +02:00
Dimitri Fontaine
a2370938b6 MATERIALIZE ALL VIEWS.
Complete the MySQL migration feature.
2014-05-26 18:03:50 +02:00
Dimitri Fontaine
e1bf53906d Don't send over useless verbose log messages.
When in :data logging mode we log the whole data set as we read then
write it, which is quite a lot of data. Our current logging system works
by filling up a queue that the cl-log lib is then fed from, and sending
lots of data in that queue is way expensive, stop doing that.

Hopefully we don't need to revisit the logs more than that, the other
messages should be few enough not to count much when doing a full load.
2014-05-26 16:59:12 +02:00
Dimitri Fontaine
2637bb7e81 Avoid double logging the TRUNCATE call, that's scary. 2014-05-26 15:47:20 +02:00
Dimitri Fontaine
e9e9e364b0 Add optional clauses USING FIELDS and TARGET COLUMNS. 2014-05-26 15:04:06 +02:00
Dimitri Fontaine
b17383fa90 Allow IN DIRECTORY sub-clause for the FILENAME MATCHING clause.
With this the user now has a say about where the files are going to be
read from and matched against the regular expression. It used not to be
necessary in the archive expansion mode, but is required now that the
feature is exposed in more cases.
2014-05-26 14:45:12 +02:00
Dimitri Fontaine
36805afc64 Fix *csv-path-root* at run-time.
When using LOAD CSV it's possible to load from filenames matching a
regular expression, but for that to work the *csv-path-root* needs to be
properly set up at run-time.
2014-05-26 11:01:19 +02:00
Dimitri Fontaine
51b9618cf6 Fix a call to truncate-tables which didn't get the memo.
In passing, have a default identifier-case of :downcase.
2014-05-26 10:59:35 +02:00
Dimitri Fontaine
92ebb13042 Keep the order in which we saw files when matching filenames. 2014-05-26 10:58:30 +02:00
Dimitri Fontaine
c21f3f06ff Retain NULL tinyints in tinyint-to-boolean, fix #65. 2014-05-24 00:03:41 +02:00
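The behavior named in c21f3f06ff can be sketched as below, in Python rather than pgloader's Common Lisp. The "t"/"f" output strings and the treatment of every non-zero value as true are assumptions made for the illustration.

```python
def tinyint_to_boolean(value):
    """Map a MySQL tinyint to a boolean literal, keeping NULL as NULL."""
    # The point of the fix: a NULL input must stay NULL, not become false.
    if value is None:
        return None
    # Assumption: 0 is false and any other value is true.
    return "f" if str(value) == "0" else "t"
```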
Dimitri Fontaine
b1ba09a21b Handle MySQL FK column names identifier case, fix #62.
The code forgot completely that MySQL column name references in foreign
key definitions have to follow the identifier case rules, this patch
fixes that.

To be able to do that, we need to parse the GROUP_CONCAT() result that
lists the FK columns, as there are apparently no arrays in MySQL. The
problem here is that about any character is allowed in column names when
`quoted`, so using a comma here might prove fragile later.
2014-05-22 12:35:54 +02:00
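The fragility noted in b1ba09a21b is easy to see in a small sketch. This is Python, not pgloader's code; the function name and the downcase flag are hypothetical stand-ins for the identifier case rules.

```python
def split_fk_columns(group_concat_result: str, downcase: bool = True):
    """Split a comma-separated column list and apply identifier case."""
    names = group_concat_result.split(",")
    return [name.lower() if downcase else name for name in names]
```

With ordinary names, "CustomerId,OrderId" splits into two columns as intended. A single quoted column literally named "a,b" would also come back as two names, which is the failure mode the commit message anticipates.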
Dimitri Fontaine
e710cacad1 Truncate all tables in a single command, fix #61.
The truncate command is only sent to PostgreSQL when we didn't just
CREATE TABLE before. Some refactoring would be necessary to fit the
TRUNCATE command within the same transaction as the CREATE TABLE
command, for PostgreSQL performance.

This patch has been tested with MySQL and SQLite sources, the trick is
that to be able to test it, it's necessary to first make a full
import (creating the target tables), so the tests are not modified yet.
2014-05-19 18:07:35 +02:00
Dimitri Fontaine
9e12035ca1 Review SQLite blob types in light of "manifest typing", fix #60.
When using SQLite 3, a blob column might return either string or byte
vector values dynamically depending on the data itself, or maybe some
more complex parameters controlled at data insert time.

Hard-code the rule that a blob column returned as a string is in fact
base64 encoded (which looks like common practice) and decode it
automatically when needed, before sending to byte-vector-to-bytea. It
might be a tad slow but at least the data is properly converted.

In future, that decision might come and byte us in the back again, at
which point it'll be necessary to consider full casting options as in
the MySQL CAST rules. It seems like a big enough win for now if we can
avoid that.
2014-05-16 23:13:57 +02:00
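The hard-coded rule from 9e12035ca1 can be sketched like this, in Python rather than pgloader's Common Lisp. The function name is hypothetical, and emitting PostgreSQL's hex bytea format is an assumption about what the byte-vector-to-bytea step produces.

```python
import base64


def blob_to_bytea(value):
    """Convert a SQLite blob value to a PostgreSQL hex-format bytea literal."""
    if value is None:
        return None
    # Rule from the commit message: a blob returned as a string is assumed
    # to be base64 encoded, so decode it to raw bytes first.
    if isinstance(value, str):
        value = base64.b64decode(value)
    # Assumption: the target is bytea hex format, "\x" followed by hex digits.
    return "\\x" + bytes(value).hex()
```

A string that is not valid base64 would need separate handling, which is the case the commit message flags as a possible future problem.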
Dimitri Fontaine
39af63b053 Implement support for SQLite blob to bytea, fixes #59.
This issue has been re-opened with blob instead of double. Semi-blindly
implement support for the blob type with an image data type.

Disturbingly enough when tested with non-binary data SQLite was
returning strings rather than byte vectors, tripping up the transform
function that sure expects byte vectors.
2014-05-16 00:28:02 +02:00
Dimitri Fontaine
d6c457d89a Add support for SQLite "double" data type, Fix #59.
This time with a test case rather than trying to blindly address the
problem in a very small amount of time.
2014-05-15 23:28:21 +02:00
Dimitri Fontaine
efda3eebfa Attempt at fixing #59 (sqlite double type casting).
Blindly add a new type conversion in the SQLite code base to handle the
source type DOUBLE and convert it to "double precision".
2014-05-13 11:57:03 +02:00
Dimitri Fontaine
d7b05ba411 Handle more conditions, fix #57.
Turns out that in some cases it's not possible to call format-vector-row
on MySQL result sets, because it's been sending us a vector of bytes
(blob) while the expected data (from the table definition) clearly is text.

Handle the error as an input reading error, skipping the line and being
verbose about it in the logs. This patch fails to update the stats about
what's happening, so it might need later changes.
2014-05-11 18:52:07 +02:00
Dimitri Fontaine
6d92dc251f Fix another useless use of loop. 2014-05-11 18:49:31 +02:00
Dimitri Fontaine
c38798a4dd Implement BEFORE/AFTER LOAD EXECUTE 'filename'.
That allows using the same SQL files as usual when using pgloader, as it
even supports the \i and \ir psql features (and dollar quoting, etc).

In passing, refactor docs to avoid saying the same things all over the
place, which isn't a very good idea in a man page, at least as far as
editing it is involved.
2014-05-04 23:04:45 +02:00
Dimitri Fontaine
267a1cc755 The Useless Use Of Loop did strike. 2014-05-03 15:55:02 +02:00
Dimitri Fontaine
e39788e5cd Fix some CCL warnings.
Those were preventing a buildapp based build.
2014-05-03 15:36:30 +02:00
Dimitri Fontaine
6e58db2994 Improve self-upgrading.
There's no reason not to parse again the command line with the newly
loaded code actually, so be sure to do the self-upgrade dance first
thing and recurse to the pgloader::main function (with a guard).
2014-05-03 15:22:34 +02:00
Dimitri Fontaine
f34017d023 Improve version strings.
Have the abbreviated git hash appear in the version string when not
using a released version of pgloader.
2014-05-03 15:21:32 +02:00
Dimitri Fontaine
fecae2c2d9 Implement --self-upgrade capacity.
From now on, to install a new version of pgloader when you have an older
one, say because there's that bug that got fixed meanwhile, all you need
to do is run

  $ git clone https://github.com/dimitri/pgloader.git /tmp/pgloader
  $ pgloader --self-upgrade /tmp/pgloader <options as usual>

Any Common Lisp developer using the product is already doing that many
times a day, it might prove useful for users to be able to hot-patch
themselves too, after all.
2014-05-03 00:25:44 +02:00
Dimitri Fontaine
1d480c2590 Refactor the parser connection bindings code production.
Every command was maintaining its own copy of what should have been the
same code from day one, centralize it.
2014-05-02 23:46:35 +02:00
Dimitri Fontaine
ee498111bc Implement MySQL local (socket) connection. Fix #39.
The parser was happily parsing such a connection string as the
following, but the rest of the code didn't really know what to do about
it:

  mysql://unix:/var/run/mysqld/mysqld.sock:/main

In passing, fix bugs where the PostgreSQL unix domain socket connection
was still shy of a brick load, omitting to consider the case where the
connection host is actually a list of '(:unix . "path/to/socket").
2014-05-02 22:48:17 +02:00
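How the connection string quoted in ee498111bc splits into a socket path and a database name can be sketched as below. This is a Python illustration, not pgloader's parser; the function name is hypothetical, and user, password and port handling are deliberately left out.

```python
def parse_mysql_uri(uri: str):
    """Return (host, dbname); host is ("unix", path) for a socket connection."""
    prefix = "mysql://"
    if not uri.startswith(prefix):
        raise ValueError("not a mysql:// connection string")
    rest = uri[len(prefix):]
    if rest.startswith("unix:"):
        # The socket path itself contains slashes, so split on the last ":/".
        socket_path, sep, dbname = rest[len("unix:"):].rpartition(":/")
        if not sep:
            raise ValueError("missing database name after socket path")
        return ("unix", socket_path), dbname
    # Simplified TCP case: everything before the first "/" is the host.
    host, _, dbname = rest.partition("/")
    return host, dbname
```

The ("unix", path) pair mirrors the '(:unix . "path/to/socket") form mentioned in the commit message: code that consumes the host has to handle both a plain name and such a pair.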
Dimitri Fontaine
182128775b Another encoding and external formats fix for portability.
Some of our internal values now depend on the implementation, and could
either be a symbol on SBCL or an external-format structure on CCL. We
could typecase our way out I suppose, but it might be that SBCL has a
different version of the external-format type, so we'd rather use #+.
2014-04-29 15:25:56 +02:00
Dimitri Fontaine
f0cc4ddef9 Fix filename matching when no match is found. 2014-04-29 14:49:55 +02:00
Dimitri Fontaine
f5f584fdf1 Fix parsing ccl:describe-character-encodings.
First, despite the documentation mentioning that the function writes
to *terminal-io*, in fact it's doing (format t ...) and thus the result
is written to *standard-output*.

Second, CCL has encodings with no aliases.
2014-04-29 14:25:40 +02:00
Dimitri Fontaine
a5a29407f0 Release pgloader version 3.0.99. 2014-04-29 13:59:33 +02:00
Dimitri Fontaine
c0d9bb4d8f Allow building the pgloader image using CCL.
Too many Makefile commands were hard-coded to use SBCL, which prevented
building successfully against CCL. That's now fixed.
2014-04-29 11:47:22 +02:00
Dimitri Fontaine
40128dbd75 Fix with-monitor support of :start-logger option.
It used to still launch an extra set of threads for monitoring, and
that would confuse CCL, where it's not possible to write into a
stream from more than one thread concurrently.
2014-04-29 11:43:03 +02:00
Dimitri Fontaine
0f62751a3f Improve summary output.
Try to get a deterministic output of it, which still apparently is not
always the case when using SBCL, now that it's been switched to using
the explicit *terminal-io* rather than t.

This change is needed for CCL support, though, where you don't get to
write to the same stream from different threads.
2014-04-29 11:42:02 +02:00
Dimitri Fontaine
3abcfeb569 Avoid empty index definitions in SQLite, fixes #52.
I could get down to the problem here, which is that a couple of indexes
were reported to pgloader but without any SQL definition for them, and
then pgloader would wait for non existing tasks.

It seems easier to just skip those indexes, that's what this patch does.
2014-04-28 16:00:34 +02:00
Dimitri Fontaine
9516a90d9d Fix SQLite support for filename parsing.
The code didn't get the memo about the way we now do support source
filenames and all.
2014-04-28 15:20:30 +02:00
Dimitri Fontaine
b758058208 Fix the fix for parsing quoted-filenames. 2014-04-28 15:18:18 +02:00
Dimitri Fontaine
b5c89e750c Quick review of the generic API documentation strings. 2014-04-28 14:36:15 +02:00
Dimitri Fontaine
429232c3de Fix loading data from stdin: fix #53.
The stdin support really was one brick shy of a load, and in particular
with-open-file was used against a stream when using that option.
2014-04-27 23:38:02 +02:00
Dimitri Fontaine
b5dec87915 Allow any non-quote characters in a quoted filename.
In particular, allow for a space to be used in the filename. The only
character that is not permitted anymore is the quote itself ('), it
should be easy enough to allow for escaping it as in the password field
if required.

Should probably fix #54, even though the lack of data currently reported
in that issue makes it a blind guess only.
2014-04-27 22:49:27 +02:00
Dimitri Fontaine
efd11ab759 Add user options to control pgloader batch behaviour.
The new WITH options allow the user to set values for the dynamic
variables *copy-batch-rows*, *copy-batch-size* and *concurrent-batches*.
That's needed in cases like issue #16, even with the batch size
defaulting to what looks like a proper setup.

In the longer term a review of the pgloader memory usage should be done
seriously, the numbers being way higher than the batch sizes we do set up
here.
2014-04-27 22:37:17 +02:00
Dimitri Fontaine
78a988eb47 Oops, forgot to add the new file charsets.lisp. 2014-04-26 18:55:43 +02:00