790 Commits

Author SHA1 Message Date
Alexander Pánek
97109153f4 Removed tinyint cast rule
This rule overrode the default rule for `tinyint(1)`: instead of emitting `boolean`, it kept the typemod and emitted `boolean(1)` in the resulting query.
2014-02-17 16:14:27 +01:00
Grégoire HUBERT
5ff2b9c4b3 Added minimum SBCL version to compile 2014-02-17 14:18:16 +01:00
Grégoire HUBERT
3f3e4b0a25 Updated compile dependency list
To compile pgloader on a freshly bootstrapped VM, it is necessary
to provide the exhaustive list of Debian packages to install.
2014-02-17 11:10:08 +01:00
Dimitri Fontaine
643875a266 Improve CSV error handling, thanks to cl-csv continue restart. 2014-02-08 17:51:15 +01:00
Dimitri Fontaine
8f6915d626 Fix issue #29, using proper quoting.
The patch from pull request #30 was hard-coding the PostgreSQL side quoting;
we are using the quote_ident() function instead, as it's now available in
every PostgreSQL production release (8.4 included).
2014-02-08 17:31:59 +01:00
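To illustrate what the quoting in the commit above has to get right, here is a minimal Python sketch of identifier quoting done by hand. It is not pgloader code, and the name `quote_ident_always` is hypothetical. It always wraps the name in double quotes and doubles embedded double quotes; the server-side `quote_ident()` additionally leaves identifiers unquoted when quoting is not needed, which is one reason to prefer calling it over hard-coding.

```python
def quote_ident_always(name: str) -> str:
    """Hypothetical helper: always double-quote a SQL identifier.

    Embedded double quotes are doubled, which is how a quoted
    identifier escapes them in PostgreSQL.
    """
    return '"' + name.replace('"', '""') + '"'


if __name__ == "__main__":
    # A mixed-case name with a space only survives when quoted.
    print(quote_ident_always("My Table"))
```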
Dimitri Fontaine
a6e2c6364f Cleanup: the MySQL list-transform function is not used anymore. 2014-02-08 17:28:04 +01:00
Dimitri Fontaine
dbfd8cf06c Implement new CSV option "lines terminated by", fixes #23. 2014-02-04 20:58:46 +01:00
Dimitri Fontaine
01c67f7625 Merge pull request #28 from fpauser/fpauser_typo_fixes
Fixed some typos
2014-02-04 01:00:59 -08:00
Falk Pauser
219aedd4de Fixed some typos 2014-02-04 00:30:48 +01:00
Dimitri Fontaine
8100b2b985 Typo fix in the Install docs. 2014-02-03 21:26:33 +01:00
Dimitri Fontaine
ee20bdd0b5 Document the "fields not enclosed" option... 2014-01-29 17:26:32 +01:00
Dimitri Fontaine
1844f40ad1 Fix map-push-queue to ensure we send an :end-of-data message no matter what. 2014-01-28 21:05:37 +01:00
Dimitri Fontaine
a8b0f91f37 Allow optional control of batch memory footprint, see #16 and #22.
With the new internal setting *copy-batch-size* it's now possible to
instruct pgloader to close batches early (before *copy-batch-rows* limit)
when crossing the byte count threshold.

When set to 20 MB it allows the new test case (exhausted) to pass under SBCL
and CCL, and there's no measurable cost when *copy-batch-size* is set to
nil (its default value) in the testing done.

This patch is published without any way to tune the values from the command
language yet; that's the next step once it's been proven effective.
2014-01-26 23:22:18 +01:00
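The early batch close described in the commit above can be sketched as follows. This is an illustrative Python sketch, not pgloader's Common Lisp implementation: `batches`, `max_rows`, and `max_bytes` are hypothetical names standing in for the roles of *copy-batch-rows* and *copy-batch-size*, and a `max_bytes` of `None` plays the role of the nil default (no byte limit).

```python
from typing import Iterable, Iterator, List, Optional


def batches(
    rows: Iterable[bytes],
    max_rows: int,
    max_bytes: Optional[int] = None,
) -> Iterator[List[bytes]]:
    """Group rows into batches, closing a batch early on a byte threshold.

    A batch is closed when it holds max_rows rows, or (if max_bytes is
    set) as soon as its accumulated size reaches max_bytes.
    """
    batch: List[bytes] = []
    size = 0
    for row in rows:
        batch.append(row)
        size += len(row)
        if len(batch) >= max_rows or (max_bytes is not None and size >= max_bytes):
            yield batch
            batch, size = [], 0
    if batch:
        # Flush the last, possibly partial, batch.
        yield batch
```

With five 10-byte rows and `max_rows=3`, the row limit alone yields batches of 3 and 2 rows; adding `max_bytes=20` closes each batch after two rows instead.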
Dimitri Fontaine
ceec4780f2 Improve log message pointing to the log file (use the true name). 2014-01-26 21:25:27 +01:00
Dimitri Fontaine
ca0d25d3b2 Provide a new log level, :data, activated when both --debug and --verbose are used. 2014-01-26 17:49:20 +01:00
Dimitri Fontaine
b60f40a5fa Fix transform function date-with-no-separator. 2014-01-26 17:48:45 +01:00
Dimitri Fontaine
db947e1467 Rework reader and writer data exchange.
With this patch, the whole data massaging and final formatting into the
PostgreSQL COPY TEXT format is done by the reader thread, which publishes a
batch at a time in the communication channel: a lparallel.queue object.

Before that, the raw vectors were pushed directly in the queue, offering
more flexibility to adjust to the reader and writer IO rates and
capabilities, but impeding the ability of the Garbage Collector: data still
in the queue was not collected even if not needed anymore.

The new model also uses less memory, and allows a better control over what
amount of data stays in memory. The new *concurrent-batches* parameter
should be key to being able to process huge rows.

The intent is to offer a way for the users to tune *concurrent-batches*
down to 1 for sources with massive per-row memory footprint. Even better
would be to find a way to automatically adjust the setting without spending
too much time counting the bytes we're batching.

Preliminary tests show no noticeable impact on performance from this patch,
and even some improvements in some cases.
2014-01-25 23:54:49 +01:00
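The reader/writer model described in the commit above can be sketched as follows. This is a Python illustration under stated assumptions, not pgloader's code: the reader formats rows into COPY TEXT lines and publishes one batch at a time, a bounded `queue.Queue` stands in for the lparallel.queue object, and its `maxsize` plays the role of *concurrent-batches*. All function names and the `END_OF_DATA` sentinel are hypothetical, and the writer appends to a list where a real one would stream to PostgreSQL.

```python
import queue
import threading

END_OF_DATA = object()  # hypothetical sentinel marking the end of the stream


def format_copy_row(row):
    """Format one row as a PostgreSQL COPY TEXT line (tab-separated, \\N for NULL)."""
    def esc(value):
        if value is None:
            return r"\N"
        return (str(value)
                .replace("\\", "\\\\")
                .replace("\t", "\\t")
                .replace("\n", "\\n")
                .replace("\r", "\\r"))
    return "\t".join(esc(v) for v in row) + "\n"


def reader(rows, out, batch_rows):
    """Format rows and publish one ready-to-send batch at a time."""
    batch = []
    try:
        for row in rows:
            batch.append(format_copy_row(row))
            if len(batch) >= batch_rows:
                out.put("".join(batch))  # blocks while the queue is full
                batch = []
        if batch:
            out.put("".join(batch))
    finally:
        out.put(END_OF_DATA)  # always tell the writer we are done


def writer(inp, sink):
    """Consume batches until the sentinel arrives."""
    while True:
        batch = inp.get()
        if batch is END_OF_DATA:
            break
        sink.append(batch)


def run(rows, concurrent_batches=2, batch_rows=2):
    """Run the writer in a thread and the reader in the calling thread."""
    channel = queue.Queue(maxsize=concurrent_batches)
    sink = []
    consumer = threading.Thread(target=writer, args=(channel, sink))
    consumer.start()
    reader(rows, channel, batch_rows)
    consumer.join()
    return sink
```

Because the queue is bounded, at most `concurrent_batches` formatted batches wait in memory at once; setting it to 1 corresponds to the tuning suggested for sources with a massive per-row footprint.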
Dimitri Fontaine
41add15397 In passing, indentation change only. 2014-01-25 23:41:10 +01:00
Dimitri Fontaine
8ac2cc4930 Skip empty lines when reading from files. 2014-01-24 15:11:15 +01:00
Dimitri Fontaine
e92f085b04 Convert --root-dir to its truename before processing it, and manage errors to do so. 2014-01-24 15:10:45 +01:00
Dimitri Fontaine
c50164e53d Manage the whole class of "integrity errors" also when retrying a batch... 2014-01-24 15:10:03 +01:00
Dimitri Fontaine
69b550a46e Make use of the new usage function... 2014-01-24 10:14:51 +01:00
Dimitri Fontaine
be4cc804c0 Show usage and help when the command line options are not recognized. 2014-01-24 09:22:02 +01:00
Dimitri Fontaine
e8fcb15c27 Fix another hasty commit erroneously containing a for-tests change. 2014-01-23 23:29:27 +01:00
Dimitri Fontaine
b374d4bc8b The current retry method has no need for *copy-batch-split*. 2014-01-23 23:28:25 +01:00
Dimitri Fontaine
d132bafc07 Refrain from parsing a non-existing command file... 2014-01-23 23:17:34 +01:00
Dimitri Fontaine
3f61c66a79 Also handle extra columns in CSV parsing. 2014-01-23 15:15:42 +01:00
Dimitri Fontaine
516ef08c37 Allow loading ragged CSV files. 2014-01-23 15:07:05 +01:00
Dimitri Fontaine
4cbe4b3218 Manage the whole class of "integrity violation" errors. 2014-01-23 14:59:46 +01:00
Dimitri Fontaine
59e87b84a0 Release Candidate 8. 2014-01-23 00:26:08 +01:00
Dimitri Fontaine
9455752805 Review and simplify batch retry processing. 2014-01-23 00:15:57 +01:00
Dimitri Fontaine
7c238f45f2 Fix batch retry handling, broken in previous refactoring. Fixes #22. 2014-01-22 22:52:18 +01:00
Dimitri Fontaine
a5c661dd4a Cleanup error recovery logging. 2014-01-22 17:57:56 +01:00
Dimitri Fontaine
ccb888164c Fix where to find relative filenames from within commands. 2014-01-22 17:11:01 +01:00
Dimitri Fontaine
c13d7bbae7 Bug fix when processing plain filenames. 2014-01-22 11:00:12 +01:00
Dimitri Fontaine
a51a712b6a Fix asd dependencies, cleanup useless and misplaced compilation options. 2014-01-21 14:37:26 +01:00
Dimitri Fontaine
6e4a3e2165 Fix parsing COPY error message without column information, see issue #22. 2014-01-20 17:19:41 +01:00
Dimitri Fontaine
c56bbab0c4 Fix #24 by allowing cast rules adding only transformation functions. 2014-01-20 16:00:09 +01:00
Dimitri Fontaine
e888b15513 Update the docs, we list the default MySQL cast there (see issue #22). 2014-01-16 10:10:23 +01:00
Dimitri Fontaine
afc64cc30d Cast MySQL smallint with auto_increment to PostgreSQL serial, fixes #22. 2014-01-16 10:07:21 +01:00
Dimitri Fontaine
6431edcd51 Update download page to link to the 3.0.97 binaries. 2014-01-15 23:01:27 +01:00
Dimitri Fontaine
80b6c46aae Version 3.0.97. 2014-01-15 22:53:43 +01:00
Dimitri Fontaine
aa49e8eec2 Fix the log-filename when operating from the command line. 2014-01-15 22:48:45 +01:00
Dimitri Fontaine
6ccb1871f5 Fix parsing dotted hostnames. 2014-01-15 10:48:56 +01:00
Dimitri Fontaine
07c614c170 Switch to the newer cl-csv API.
Thanks to the work at https://github.com/AccelerationNet/cl-csv/pull/12 we
can now use the main branch of cl-csv again.
2014-01-11 18:28:24 +01:00
Dimitri Fontaine
539ad57347 HTML fixes for pgloader.tapoueh.org 2014-01-06 17:01:08 +01:00
Dimitri Fontaine
ad7e3a1b9d Update copyright information. 2014-01-04 22:38:13 +01:00
Dimitri Fontaine
158b9cd79c Fix the centos bootstrap script. 2014-01-03 16:49:31 +01:00
Dimitri Fontaine
13d53593b1 Fix buildapp to require some SBCL extensions, fix ASDF setup. 2014-01-03 16:49:08 +01:00
Dimitri Fontaine
2080d91e40 Fix dependency declarations in between files, should help with #19. 2014-01-02 23:48:57 +01:00