Commit Graph

61 Commits

Author SHA1 Message Date
Dimitri Fontaine
b4bfa18877 Fix more table name quoting, fix #163 again.
Now that we can have several threads doing COPY, each of them needs to
know about the *pgsql-reserved-keywords* list. Make sure that's the case
and in passing fix some call sites to apply-identifier-case.

Also, more disturbingly, fix the code so that TRUNCATE is called from
the main thread before giving control to the COPY threads, rather than
having two concurrent threads doing the TRUNCATE twice. It's rather
strange that we got no complaint from the field on that part...
2015-12-08 11:52:43 +01:00
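
The ordering fix described in this commit can be illustrated with a minimal Python sketch (pgloader itself is Common Lisp). The in-memory `table` list, the `load` function and the `copy_worker` helper are all hypothetical stand-ins, not pgloader code: the point is only that the truncate step runs once in the main thread before any COPY worker starts.

```python
import threading

def load(batches, table, log):
    """Empty the target once, then let one worker per batch append rows."""
    # Done by the main thread before any worker exists, so it happens
    # exactly once and can never wipe rows a worker has already written.
    table.clear()
    log.append("truncate")

    lock = threading.Lock()

    def copy_worker(rows):
        # Stand-in for a COPY thread: append rows under a lock.
        for row in rows:
            with lock:
                table.append(row)

    workers = [threading.Thread(target=copy_worker, args=(b,)) for b in batches]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return table

table = ["stale row"]
log = []
load([[1, 2], [3, 4]], table, log)
```

Had each worker issued its own truncate, a late worker could have removed rows already written by an earlier one.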
Dimitri Fontaine
150d288d7a Improve our regression testing facility.
Next parallelism improvements will allow pgloader to use more than one
COPY thread to load data, with the impact of changing the order of rows
in the database.

Rather than doing a copy out and `diff` of the data just loaded, load
the reference data and do the diff in SQL:

          select * from loaded.data
  except
          select * from expected.data

If such a query returns any row, we know we didn't load what was
expected and the regression test is failing.

This regression testing facility should also allow us to finally add
support for multiple-table regression tests (sqlite, mysql, etc).
2015-11-17 17:03:08 +01:00
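
As a rough illustration of the SQL-based diff, here is a small Python sketch using the standard `sqlite3` module rather than PostgreSQL. The table names `loaded_data` and `expected_data` and the helper `regression_diff` are assumptions made for the example; only the `except` query shape comes from the commit message.

```python
import sqlite3

def regression_diff(conn):
    """Return rows that were loaded but are absent from the reference data."""
    return conn.execute(
        "select * from loaded_data except select * from expected_data"
    ).fetchall()

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table expected_data (id integer, name text);
    create table loaded_data   (id integer, name text);
    insert into expected_data values (1, 'a'), (2, 'b');
    -- Same rows in a different order, as several COPY threads might produce.
    insert into loaded_data   values (2, 'b'), (1, 'a');
""")
same_rows = regression_diff(conn)      # no row returned: the test passes

conn.execute("insert into loaded_data values (3, 'oops')")
extra_rows = regression_diff(conn)     # the unexpected row is reported
```

Because `except` is a set operation, row order does not matter. Note that the query as written only reports rows present in the loaded data and absent from the expected data; catching missing rows would need the same query with the two tables swapped.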
Dimitri Fontaine
6ca376ef9b Simplify the main function (refactor).
Move some code away in its own function for easier review and
modifications of the main entry point.
2015-11-16 16:01:25 +01:00
Dimitri Fontaine
c3726ce07a Refrain from starting the logger twice in load-data. 2015-10-05 21:27:48 +02:00
Dimitri Fontaine
96a33de084 Review the stats and reporting code organisation.
In order to later be able to have more worker threads sharing the
load (multiple readers and/or writers, maybe more specialized threads
too), have all the stats be managed centrally by a single thread. We
already have a "monitor" thread that gets passed log messages so that the
output buffer is not subject to race conditions; extend its use to also
deal with statistics messages.

In the current code, we send a message each time we read a row. In some
future commits we should probably reduce the messaging here to something
like one message per batch in the common case.

Also, as a nice side effect of the code simplification and refactoring
this fixes #283 wherein the before/after sections of individual CSV
files within an ARCHIVE command were not counted in the reporting.
2015-10-05 01:46:29 +02:00
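
The central "monitor" design can be sketched in Python with a queue and a single consumer thread. This is only an illustration under assumed names (`monitor`, `worker`, the `("stats", ...)` and `("log", ...)` message shapes); it is not pgloader's actual message protocol. It sends one statistics message per batch, which is the reduction the commit message suggests for later.

```python
import queue
import threading
from collections import Counter

def monitor(inbox, stats, log):
    """Single consumer: the only thread that touches stats and log output."""
    while True:
        kind, payload = inbox.get()
        if kind == "stop":
            break
        if kind == "log":
            log.append(payload)
        elif kind == "stats":
            table, rows = payload
            stats[table] += rows

def worker(inbox, table, batches):
    for batch in batches:
        # One message per batch rather than one per row.
        inbox.put(("stats", (table, len(batch))))
    inbox.put(("log", f"done reading {table}"))

inbox, stats, log = queue.Queue(), Counter(), []
mon = threading.Thread(target=monitor, args=(inbox, stats, log))
mon.start()
workers = [threading.Thread(target=worker, args=(inbox, name, [[1, 2], [3]]))
           for name in ("a", "b")]
for w in workers:
    w.start()
for w in workers:
    w.join()
inbox.put(("stop", None))   # all workers are done, so this is the last message
mon.join()
```

Since only the monitor thread updates `stats` and `log`, the workers need no locking of their own.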
Dimitri Fontaine
ea35eb575d Implement --dry-run option, fix #264.
The dry run option will currently only check database connections, but
as that happens after having correctly parsed the load file, it also
allows checking that the command file is correct for the parser.

Note that the list load-data API isn't subject to the dry-run method.

In passing, we add some more API entry points to the connection objects
and we should actually clean the code base to use the new QUERY generic
all over the place. It's for another patch tho.
2015-08-22 16:23:47 +02:00
Dimitri Fontaine
322f7dd8b5 Improve logging when loading extra code, see #245. 2015-06-11 13:02:29 +02:00
Dimitri Fontaine
62931e0312 Fix unknown source type error, fix #237.
In passing also recognize the ".sqlite3" file type as being a SQLite
database file.
2015-05-22 11:32:30 +02:00
Dimitri Fontaine
533e6b623f Review upgrade config code, fix #235.
The database connection code needed to switch to the "new" connection
facilities, and there was a bug in the processing of template sections
wherein the template user would inherit the template property.
2015-05-19 18:12:10 +02:00
Dimitri Fontaine
5ac396799a Be careful about the OS return code, fix #190.
Define a bunch of OS return codes and use them wisely, or at least in a
better way than just doing (uiop:quit) whenever there's something wrong,
without any difference whatsoever to the caller.

Now we return a non-zero error code when we know something wrong did
happen. Which is more useful.
2015-04-17 22:30:04 +02:00
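
The idea of mapping failure kinds to distinct exit statuses can be sketched as follows. The numeric values, the exception classes and the `run` helper are all hypothetical; the commit message does not list the codes pgloader actually defines.

```python
# Hypothetical exit statuses; the commit does not list the actual values.
EXIT_SUCCESS = 0
EXIT_BAD_ARGS = 2
EXIT_LOAD_ERROR = 3

class BadArguments(Exception):
    """Raised for unusable command-line input (illustrative)."""

class LoadError(Exception):
    """Raised when loading data fails (illustrative)."""

def run(job):
    """Run a job and translate known failures into an exit status."""
    try:
        job()
    except BadArguments:
        return EXIT_BAD_ARGS
    except LoadError:
        return EXIT_LOAD_ERROR
    return EXIT_SUCCESS

def failing_job():
    raise LoadError("target table is missing")

status = run(failing_job)
# A real entry point would finish with sys.exit(status), so that the
# calling shell or script can tell the failure kinds apart.
```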
Dimitri Fontaine
290916b0f0 Attempt at fixing --self-upgrade.
The option currently only works within the same build environment where
the image was first built, as noted in #133. This is an attempt at
convincing ASDF not to load systems that pgloader depends on in order to
be able to load only the new pgloader definition.

While it looks sound in principle, I failed to have it work in the lab.
Given that nothing worked at all before this patch, it's not a
regression, so let's push it as it makes the code saner.

Also, it looks like asdf::*immutable-systems* is what we want here, but
that's asdf 3.1.x and we're not there yet.
2015-01-14 20:54:11 +01:00
Dimitri Fontaine
2caefb0836 Fix and improve new summary reporting. 2015-01-06 12:36:14 +01:00
Dimitri Fontaine
ad8fb0b2a4 Implement machine readable summary files, fixes #144.
It's now possible to have pgloader print out its summary in one of
several formats: human-readable (default), csv, copy or json. The
choice of format is made depending on the extension of the summary
filename picked on the command line with the option --summary.
2015-01-06 01:22:01 +01:00
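
Choosing the output format from the summary filename could look like the following Python sketch. The extension spellings in `FORMATS` and the function name `summary_format` are assumptions; the commit only names the four formats.

```python
from pathlib import Path

# Extension-to-format table; the exact extensions are assumed.
FORMATS = {".csv": "csv", ".copy": "copy", ".json": "json"}

def summary_format(filename):
    """Pick a summary format from the filename, defaulting to human-readable."""
    return FORMATS.get(Path(filename).suffix.lower(), "human-readable")
```

For instance, `summary_format("load.json")` selects the JSON format, while a name such as `summary.txt` falls back to the human-readable default.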
Dimitri Fontaine
d9f5bff5e0 Cleanup some code location. 2015-01-06 01:18:54 +01:00
Dimitri Fontaine
a86369a03d Improve the CLI situation a bit.
Fix bugs related to parsing the new COPY type, and make it so that we
know how to parse options (and fields, and other type-dependent things)
even when --type is missing, in case the source URL has the information.
2015-01-06 00:07:31 +01:00
Dimitri Fontaine
e1bc6425e2 Implement support for PostgreSQL COPY format, fix #145.
PostgreSQL COPY format is not really CSV but something way easier to
parse. Funnily enough, parsing it as CSV is not that easy, so we add
here a special simple parser for the COPY format.

It should also be quite useful for loading pgloader's reject data files
again after manual fixing. It's still missing some documentation
without any good excuse for that, will add soon.
2015-01-02 18:49:17 +01:00
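
To show why the COPY text format is simpler to parse than CSV, here is a minimal Python sketch of a line parser. It is not pgloader's parser: it assumes the default tab delimiter and `\N` null marker, handles only the single-character backslash escapes, and leaves out the octal and hexadecimal escapes that PostgreSQL also accepts.

```python
# Single-character escapes of the COPY text format.
ESCAPES = {"b": "\b", "f": "\f", "n": "\n", "r": "\r", "t": "\t", "v": "\v"}

def unescape(field):
    """Decode backslash escapes in one field."""
    out, i = [], 0
    while i < len(field):
        ch = field[i]
        if ch == "\\" and i + 1 < len(field):
            nxt = field[i + 1]
            # Known escapes map to control characters; any other escaped
            # character (including the backslash) stands for itself.
            out.append(ESCAPES.get(nxt, nxt))
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)

def parse_copy_line(line, delimiter="\t", null="\\N"):
    """Split one COPY text line into fields, mapping the null marker to None."""
    # A plain split is enough: literal delimiters inside data are escaped,
    # so there is no quoting state to track as there is with CSV.
    fields = line.rstrip("\n").split(delimiter)
    return [None if f == null else unescape(f) for f in fields]
```

For example, `parse_copy_line("1\t\\N\ta\\tb\n")` yields `["1", None, "a\tb"]`.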
Dimitri Fontaine
40f3c4f769 Improve HTTP handling of CSV and Fixed data sources.
In passing also allow --field to specify the whole field list, there's
no point in forcing the user to have as many --field switches on the
command line as they have columns in their data source file.
2014-12-27 17:02:19 +01:00
Dimitri Fontaine
302a7d402b Refactor connection handling, and clean-up many things.
That's the big refactoring patch I've been sitting on for too long.

First, refactor connection handling to use a uniform "connection"
concept (class and generic functions API) everywhere, so that the COPY
derived objects just use that in their :source-db and :target-db slots.

Given that, we don't need no messing around with *pgconn* and *myconn-*
and other special variables at all anywhere in the tree.

Second, clean up some oddities accumulated over time, where some parts
of the code didn't get the memo when new API got into place.

Third, fix any other oddity or missing part found while doing those
first two activities, it was long overdue anyway...
2014-12-26 21:50:29 +01:00
Dimitri Fontaine
3362a1da19 Improve source and target uri parsing.
In particular, make dbf and db3 synonyms as far as --type is concerned.
2014-12-23 17:32:55 +01:00
Dimitri Fontaine
6eac0d9dd8 Implement --before and --after options on the command line.
That allows using SQL scripts to run before and after the main data
processing and loading done by pgloader when used only from the command
line.
2014-12-23 12:21:44 +01:00
Dimitri Fontaine
65c2043694 Improve pgloader usage from the command line.
Make it so that the following command line usages are accepted when
using pgloader without a command file:

 ./build/bin/pgloader ./test/sqlite/sqlite.db postgresql:///pgloader

 ./build/bin/pgloader --set "search_path='sakila'"  \
                      mysql://root@localhost/sakila \
                      postgresql:///sakila

 ./build/bin/pgloader --type csv                             \
                      --field id --field field               \
                      --with truncate                        \
                      --with "fields terminated by ','"      \
                      ./test/data/matching-1.csv             \
                      postgres:///pgloader?matching

It's now possible in most cases to just use command-line options, which
should make the entry bar to pgloader much lower.
2014-12-23 02:40:13 +01:00
Dimitri Fontaine
ed853a7bea Allow pgloader to work on Windows. 2014-11-06 22:12:20 +01:00
Dimitri Fontaine
4916e67a9e Fix --root-dir, Debian reported bug #767288 2014-11-03 14:08:51 +01:00
Dimitri Fontaine
9c604f969b Rename --load into --load-lisp-file
To avoid wasting everybody's time when trying to debug --load
command.load, rename the option to be more explicit about what it does.
Also implement some basic guards in the form of testing that the
filename extension is part of a very short whitelist: .lisp, .cl, .lsp
and .asd.
2014-09-02 22:33:51 +02:00
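
The extension guard can be sketched in a few lines of Python. The function name `check_lisp_file` and the error message are hypothetical; the whitelist itself is the one given in the commit message.

```python
from pathlib import Path

# Whitelist taken from the commit message.
ALLOWED = {".lisp", ".cl", ".lsp", ".asd"}

def check_lisp_file(filename):
    """Reject anything that does not look like a Lisp source file."""
    if Path(filename).suffix not in ALLOWED:
        raise ValueError(f"{filename}: not a recognised Lisp source file")
    return filename
```

With such a guard, passing a pgloader command file such as `command.load` to the option fails immediately instead of being loaded as Lisp code.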
Dimitri Fontaine
a41f8ea6d3 Create root-dir when it does not exist, fix #35. 2014-08-29 23:19:01 +02:00
Dimitri Fontaine
d00837f8fc Fix --upgrade-config basic usage. 2014-07-13 16:35:53 +02:00
Dimitri Fontaine
de4ff30acc Implement --summary to copy the output to a file, fix #68.
Given that redirecting a tty such as *terminal-io* isn't easy enough,
let's provide a way to copy the summary output to a file. Another way to
solve it would have been to output the summary to the main logs, but
that could have made the logs parsing more difficult than necessary.

Let's see how users like it...
2014-06-14 23:31:11 +02:00
Dimitri Fontaine
e93ba8b887 Fix handling --client-min-messages and --log-min-messages.
Should help with issue #67 by allowing --client-min-messages to
effectively control entering the debugger in case of unhandled
conditions, etc.

Contrary to the discussion, in this patch --log-min-messages has no
impact on the behavior of the console and interactive behaviors.
2014-05-28 16:37:38 +02:00
Dimitri Fontaine
6e58db2994 Improve self-upgrading.
There's no reason not to parse again the command line with the newly
loaded code actually, so be sure to do the self-upgrade dance first
thing and recurse to the pgloader::main function (with a guard).
2014-05-03 15:22:34 +02:00
Dimitri Fontaine
fecae2c2d9 Implement --self-upgrade capacity.
From now on, to install a new version of pgloader when you have an older
one, say because there's that bug that got fixed meanwhile, all you need
to do is run

  $ git clone https://github.com/dimitri/pgloader.git /tmp/pgloader
  $ pgloader --self-upgrade /tmp/pgloader <options as usual>

Any Common Lisp developer using the product is already doing that many
times a day, it might prove useful for users to be able to hot-patch
themselves too, after all.
2014-05-03 00:25:44 +02:00
Dimitri Fontaine
c0d9bb4d8f Allow building the pgloader image using CCL.
Too many Makefile commands were hard-coded to use SBCL, which prevented
building successfully against CCL. That's now fixed.
2014-04-29 11:47:22 +02:00
Dimitri Fontaine
35ca4927e9 Get rid of some lib dependencies.
The charset business isn't worth depending on an AGPL licenced lib which
is part of a huge Quicklisp system.
2014-04-25 17:21:11 +02:00
Dimitri Fontaine
ceec4780f2 Improve log message pointing to the log file (use the true name). 2014-01-26 21:25:27 +01:00
Dimitri Fontaine
ca0d25d3b2 Provide a new log level, :data, activated when both --debug and --verbose are used. 2014-01-26 17:49:20 +01:00
Dimitri Fontaine
e92f085b04 Convert --root-dir to its truename before processing it, and manage errors to do so. 2014-01-24 15:10:45 +01:00
Dimitri Fontaine
69b550a46e Make use of the new usage function... 2014-01-24 10:14:51 +01:00
Dimitri Fontaine
be4cc804c0 Show usage and help when the command line options are not recognized. 2014-01-24 09:22:02 +01:00
Dimitri Fontaine
d132bafc07 Refrain from parsing a non-existing command file... 2014-01-23 23:17:34 +01:00
Dimitri Fontaine
aa49e8eec2 Fix the log-filename when operating from the command line. 2014-01-15 22:48:45 +01:00
Dimitri Fontaine
f2bec5fcd1 Fix the main command line "driver" to use the new with-monitor API. 2013-12-24 19:54:22 +01:00
Dimitri Fontaine
aca04b1514 Fix problem found when trying to load the code with CCL. 2013-12-19 23:08:02 +01:00
Dimitri Fontaine
32d91d7054 Return a non-zero error code to the OS when something unexpected did happen. 2013-11-25 11:25:57 +01:00
Dimitri Fontaine
c482015248 Arrange to error out when given non-existing file names in the command line. 2013-11-23 21:36:22 +01:00
Dimitri Fontaine
2a344ab7ce Add a usage line in the --help output. 2013-11-22 11:23:15 +01:00
Dimitri Fontaine
1419a1f65d Default --logfile to pgloader.log within --root-dir. 2013-11-21 14:28:11 +01:00
Dimitri Fontaine
516bb28ec0 Fix using TMPDIR from the environment when running from the binary image. 2013-11-21 13:06:16 +01:00
Dimitri Fontaine
d70ed07b12 If --logfile is relative, expand it into --root-dir. 2013-11-21 12:56:54 +01:00
Dimitri Fontaine
febbc03459 Review some log messages and levels. 2013-11-21 11:37:47 +01:00
Dimitri Fontaine
d52240a95e Review the --root-dir patches for better default management.
Also ensure the directory we're given actually exists on disk, creating it
if necessary, and bail out early in case for whatever reason it's not
possible to create the directory.
2013-11-21 11:27:54 +01:00
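
The --root-dir and --logfile behaviour described in the commits above (create the directory when needed, default the log file to pgloader.log, expand a relative log file into the root directory) can be summarised with a Python sketch. The function `resolve_paths` is a hypothetical illustration, not pgloader's code.

```python
import tempfile
from pathlib import Path

def resolve_paths(root_dir, logfile="pgloader.log"):
    """Ensure root_dir exists and resolve logfile relative to it."""
    root = Path(root_dir)
    # Create the directory if needed; an OSError raised here stops the
    # run early, before any data is processed.
    root.mkdir(parents=True, exist_ok=True)
    log = Path(logfile)
    if not log.is_absolute():
        log = root / log
    return root, log

with tempfile.TemporaryDirectory() as tmp:
    root, log = resolve_paths(Path(tmp) / "pgloader")
    created = root.is_dir()
    default_log = log.name
    relative_ok = log.parent == root
```

An absolute log file name is left untouched, so only relative names end up under the root directory.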
Cédric Villemain
081f6be0b4 Add new command line parameter for root-dir
This variable replaces reject-root-path and is used to set the root working
directory.

It defaults to /tmp/pgloader/ like previously.

Also set the logfile according to the root-dir.

TODO: tmpdir is not handled on the command line. Do we really want more
command line parameters?
2013-11-21 11:00:02 +01:00