54 Commits

Author SHA1 Message Date
Dimitri Fontaine
55584406fa Add encoding support for db3 sources, fix #176.
It appears that db3 files are not limited to the ASCII character
encoding that they were designed with, so let's clue pgloader about
that.

This commit build
770cbe3526
and the pgloader Makefile has been updated to momentarily fetch cl-db3
from github rather than Quicklisp so that it's possible to enjoy the new
feature immediately.
2015-02-18 22:40:03 +01:00
Dimitri Fontaine
cd46b6cbed Clean up common code for sources.
Only move code around, creating a src/sources/common directory with
several files in there so as to split the too big src/sources.lisp.
2015-01-08 23:17:40 +01:00
Dimitri Fontaine
e1bc6425e2 Implement support for PostgreSQL COPY format, fix #145.
PostgreSQL COPY format is not really CSV but something way easier to
parse. Funnily enough, parsing it as CSV is not that easy, so we add
here a special simple parser for the COPY format.

It should also be quite useful for loading reject data files from
pgloader again after fixing them manually. It's still missing some
documentation without any good excuse for that, will add soon.
2015-01-02 18:49:17 +01:00
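A minimal sketch of why the COPY text format is easier to parse than CSV, written in Python rather than pgloader's Common Lisp. The function names are hypothetical and not taken from pgloader. It assumes the default tab delimiter and `\N` NULL marker, and it does not handle the octal (`\NNN`) or hex (`\xNN`) escapes that PostgreSQL also accepts.

```python
from typing import List, Optional

# Single-character backslash escapes defined by the PostgreSQL COPY text format.
_ESCAPES = {"b": "\b", "f": "\f", "n": "\n", "r": "\r", "t": "\t", "v": "\v", "\\": "\\"}


def unescape_copy_field(raw: str) -> Optional[str]:
    """Decode one COPY text field; the two-character field \\N means NULL."""
    if raw == r"\N":
        return None
    out = []
    i = 0
    while i < len(raw):
        ch = raw[i]
        if ch == "\\" and i + 1 < len(raw):
            nxt = raw[i + 1]
            # Unknown escapes stand for the escaped character itself.
            out.append(_ESCAPES.get(nxt, nxt))
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


def parse_copy_line(line: str, delimiter: str = "\t") -> List[Optional[str]]:
    """Split one COPY text line into fields.

    Literal delimiters inside data are always backslash-escaped in this
    format, so a plain split is enough: there is no quoting state to track,
    unlike CSV.
    """
    return [unescape_copy_field(field) for field in line.rstrip("\n").split(delimiter)]
```

For example, `parse_copy_line("1\tfoo\\tbar\t\\N\n")` yields three fields: the string `1`, the string `foo<TAB>bar`, and `None`.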
Dimitri Fontaine
302a7d402b Refactor connection handling, and clean-up many things.
That's the big refactoring patch I've been sitting on for too long.

First, refactor connection handling to use a uniform "connection"
concept (class and generic functions API) everywhere, so that the COPY
derived objects just use that in their :source-db and :target-db slots.

Given that, we don't need no messing around with *pgconn* and *myconn-*
and other special variables at all anywhere in the tree.

Second, clean up some oddities accumulated over time, where some parts
of the code didn't get the memo when new API got into place.

Third, fix any other oddity or missing part found while doing those
first two activities, it was long overdue anyway...
2014-12-26 21:50:29 +01:00
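The idea of a uniform "connection" concept can be illustrated with a small Python sketch. This is only an analogy: pgloader uses a Common Lisp class with generic functions, and every class and method name below is hypothetical.

```python
from abc import ABC, abstractmethod


class Connection(ABC):
    """Hypothetical base class: every source or target exposes the same API."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.handle = None  # set by open(), cleared by close()

    @abstractmethod
    def open(self) -> "Connection":
        """Acquire the underlying resource and return self."""

    @abstractmethod
    def close(self) -> None:
        """Release the underlying resource."""

    # Context-manager support removes the need for global connection variables.
    def __enter__(self) -> "Connection":
        return self.open()

    def __exit__(self, exc_type, exc, tb) -> bool:
        self.close()
        return False  # do not swallow exceptions


class InMemoryConnection(Connection):
    """Toy implementation standing in for a real database or file connection."""

    def open(self) -> "Connection":
        self.handle = []
        return self

    def close(self) -> None:
        self.handle = None
```

Code that copies data can then hold a source and a target connection object and call the same `open` and `close` operations on both, whatever kind of source or target each one is.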
Dimitri Fontaine
87e157bee2 Add a new database source type in the parser.
Now it's possible to parse a command to load data from MS SQL. The
parser was until now parsing all database URIs within the same common
rule, and that isn't possible anymore if we want to distinguish between
source databases right from the parser, which we actually want to do.

This patch also implements in-passing fixes all over the place, including
the transformation function float-to-string that only happened to work
on double-float data.
2014-11-17 00:23:06 +01:00
Dimitri Fontaine
fff756f95f Refactor the command parser.
Split its content into separate files, so that each is easier to
maintain, and to make it easier also to add support for new sources.
2014-11-16 22:22:04 +01:00
Dimitri Fontaine
03bba5f486 Some more SQL Server support (schema conversion).
Converting the table definitions (with type casting) seems to work. Also
did experiment a little with actually fetching some data... and had to
edit the cl-mssql driver, which is temporarily monkey patched.
2014-11-10 01:16:10 +01:00
Dimitri Fontaine
ca325ba799 Refactor the SQLite source files. 2014-11-09 22:59:30 +01:00
Dimitri Fontaine
6473a892d4 First steps toward MS SQL compatibility. 2014-11-09 00:09:42 +01:00
Dimitri Fontaine
ed853a7bea Allow pgloader to work on windows. 2014-11-06 22:12:20 +01:00
Dimitri Fontaine
3c334dcdc4 Refactor the main parser to use the bind macro.
The metabang-bind lib offers a nice bind macro that solves the problem
of ignoring bindings in destructuring-bind, and allows a let* approach
to nested destructuring (even when mixed with let declarations).

Using that lib (that we already indirectly depend on anyway) simplifies
the parser code substantially.
2014-10-02 17:05:35 +02:00
Dimitri Fontaine
7cf7e714fc Implement the source date format option. 2014-10-02 01:03:24 +02:00
Dimitri Fontaine
2369a142a7 Refactor source code organisation.
In passing, fix a bug in the previous commit where left-over code would
cancel the whole new parsing code for advanced source fields options.
2014-10-01 23:20:24 +02:00
Dimitri Fontaine
422f87e912 We don't use the zip system anymore. 2014-09-10 22:19:59 +02:00
Dimitri Fontaine
3e0526c957 Implement early support for IXF files. 2014-07-14 21:53:50 +02:00
Dimitri Fontaine
807f5cefcd Fix omitted file dependency (reading queries from file). 2014-06-16 14:24:05 +02:00
Dimitri Fontaine
c3742a9410 Typo fix cl-base64 system's name, fix the fix for #60. 2014-05-16 23:36:45 +02:00
Dimitri Fontaine
9e12035ca1 Review SQLite blob types in light of "manifest typing", fix #60.
When using SQLite 3, a blob column might return either string or byte
vector values dynamically depending on the data itself, or maybe some
more complex parameters controlled at data insert time.

Hard-code the rule that a blob column returned as a string is in fact
base64 encoded (which looks like common practice) and decode it
automatically when needed, before sending to byte-vector-to-bytea. It
might be a tad slow but at least the data is properly converted.

In future, that decision might come and byte us in the back again, at
which point it'll be necessary to consider full casting options as in
the MySQL CAST rules. It seems like a big enough win for now if we can
avoid that.
2014-05-16 23:13:57 +02:00
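The hard-coded rule described above can be sketched in Python as follows. The function names are hypothetical, and the second function is only a stand-in for pgloader's byte-vector-to-bytea: it emits PostgreSQL's hex bytea notation without the extra backslash escaping that the COPY text format would require.

```python
import base64
from typing import Optional, Union


def blob_to_bytes(value: Union[str, bytes, bytearray, None]) -> Optional[bytes]:
    """Normalise a SQLite blob value to raw bytes.

    Assumption from the commit message: a blob returned as a string is
    base64 encoded, so decode it; byte vectors pass through unchanged.
    """
    if value is None:
        return None
    if isinstance(value, str):
        return base64.b64decode(value)
    return bytes(value)


def bytes_to_bytea_hex(data: bytes) -> str:
    """Render raw bytes in PostgreSQL's hex bytea notation (\\x followed by hex digits)."""
    return "\\x" + data.hex()
```

With this rule, both `blob_to_bytes("aGVsbG8=")` and `blob_to_bytes(b"hello")` produce the same bytes, so the later conversion to bytea does not depend on how SQLite returned the value.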
Dimitri Fontaine
35ca4927e9 Get rid of some lib dependencies.
The charset business isn't worth depending on an AGPL licenced lib which
is part of a huge Quicklisp system.
2014-04-25 17:21:11 +02:00
Dimitri Fontaine
4d6def8105 Move some MySQL old import/export functions apart... 2014-03-04 13:52:48 +01:00
Dimitri Fontaine
db947e1467 Rework reader and writer data exchange.
With this patch, the whole data massaging and final formatting into the
PostgreSQL COPY TEXT format is done by the reader thread, which publishes a
batch at a time in the communication channel: a lparallel.queue object.

Before that, the raw vectors were pushed directly in the queue, offering
more flexibility to adjust to the reader and writer IO rates and
capabilities, but impeding the ability of the Garbage Collector: data still
in the queue was not collected even if not needed anymore.

The new model also uses less memory, and allows a better control over what
amount of data stays in memory. The new *concurrent-batches* parameter
should be key to being able to process huge rows.

The intent is to offer a way for the users to tune *concurrent-batches*
down to 1 for sources with massive per-row memory footprint. Even better
would be to find a way to automatically adjust the setting without spending
too much time counting the bytes we're batching.

Preliminary tests show no noticeable impact on performance from this patch,
and even some improvements in some cases.
2014-01-25 23:54:49 +01:00
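The reader and writer exchange described above can be sketched in Python with a bounded queue standing in for the lparallel.queue object. All names here are hypothetical. The `concurrent_batches` argument plays the role of *concurrent-batches*, and row formatting is simplified: it handles NULL but does not escape special characters.

```python
import queue
import threading
from typing import Iterable, List, Sequence


def format_row(row: Sequence[object]) -> str:
    """Simplified COPY TEXT formatting: tab-separated fields, \\N for NULL."""
    return "\t".join(r"\N" if v is None else str(v) for v in row) + "\n"


def reader(rows: Iterable[Sequence[object]], channel: queue.Queue, batch_size: int) -> None:
    """Format rows and publish one ready-to-send batch at a time."""
    batch: List[str] = []
    for row in rows:
        batch.append(format_row(row))
        if len(batch) >= batch_size:
            channel.put("".join(batch))  # blocks while the queue is full
            batch = []
    if batch:
        channel.put("".join(batch))
    channel.put(None)  # sentinel: no more batches


def writer(channel: queue.Queue, sink: List[str]) -> None:
    """Consume formatted batches; a real writer would send them to PostgreSQL."""
    while True:
        batch = channel.get()
        if batch is None:
            break
        sink.append(batch)


def run_pipeline(rows, batch_size: int = 2, concurrent_batches: int = 1) -> List[str]:
    # The queue bound caps how many formatted batches sit in memory at once.
    channel: queue.Queue = queue.Queue(maxsize=concurrent_batches)
    sink: List[str] = []
    thread = threading.Thread(target=reader, args=(rows, channel, batch_size))
    thread.start()
    writer(channel, sink)
    thread.join()
    return sink
```

Because the reader blocks once `concurrent_batches` batches are waiting, lowering that value to 1 limits memory use for sources with very large rows, at the cost of less slack between reader and writer speeds.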
Dimitri Fontaine
a51a712b6a Fix asd dependencies, cleanup useless and misplaced compilation options. 2014-01-21 14:37:26 +01:00
Dimitri Fontaine
2080d91e40 Fix dependency declarations in between files, should help with #19. 2014-01-02 23:48:57 +01:00
Dimitri Fontaine
17b366ca82 Create a website to present the software. 2014-01-02 23:25:23 +01:00
Dimitri Fontaine
b2c9e0d2dc Refactor the whole logging infrastructure not to depend on threads sharing streams. 2013-12-24 19:08:55 +01:00
Dimitri Fontaine
f02eb641b4 Switch from cl-mysql to qmynd, an all-lisp driver for MySQL. 2013-12-03 22:05:39 +01:00
Dimitri Fontaine
3486cc688f Looks like I forgot to add fixed.lisp in the asd system definitions. 2013-11-08 21:50:40 +01:00
Dimitri Fontaine
5ce5d53d7d Use trivial-backtrace to display more useful information in case of unexpected events, hopefully. 2013-11-07 20:14:06 +01:00
Dimitri Fontaine
6a75187b7d Refactor MySQL to use the new API. 2013-11-04 19:16:08 +01:00
Dimitri Fontaine
0a38195853 Refactoring the API with a real definition of it, and reorg the source tree. 2013-11-04 13:21:45 +01:00
Dimitri Fontaine
50114a0d3a Hack-in some support for SQLite data source, including some refactoring preps. 2013-10-24 00:21:46 +02:00
Dimitri Fontaine
ffebcf3bc7 Clean out the code by splitting away a bunch of PostgreSQL related facilities. 2013-10-21 22:35:22 +02:00
Dimitri Fontaine
fb818ee0e3 Move sources into their own subdirectory, assorted cleaning. 2013-10-20 19:09:09 +02:00
Dimitri Fontaine
6d27d28287 Implement a converter from old .INI syntax to current commands. 2013-10-12 23:59:28 +02:00
Dimitri Fontaine
2bf7c4df12 Assorted clean up to prepare a binary image. 2013-10-03 17:42:09 +02:00
Dimitri Fontaine
2ff0d11332 Fix a typo in the com.informatimago.clext ASD dependency declaration. 2013-09-30 17:31:28 +02:00
Dimitri Fontaine
2a6c974f8e Handle input file encodings. 2013-09-30 00:26:41 +02:00
Dimitri Fontaine
b4e530981c WIP implementing full archive fetching and downloading. 2013-09-24 18:34:05 +02:00
Dimitri Fontaine
7151a2ea62 Refactor transaction handling, depend on a patch to postmodern. 2013-09-23 11:30:20 +02:00
Dimitri Fontaine
e6d4c73c1b Make Xach's db3 lib into its own asdf piece and integrate it with pgloader. 2013-09-19 00:42:35 +02:00
Dimitri Fontaine
6e4edc4560 Split ABNF implementation into its own Quicklisp ready system. 2013-09-09 14:03:28 +02:00
Dimitri Fontaine
725c66f278 First stab at the ABNF parser generator, for easy user edits of syslog message grammar. 2013-09-05 00:14:55 +02:00
Dimitri Fontaine
1f42318bb2 Import DBF v3 file reader from Xach, with permissions. 2013-08-31 23:11:54 +02:00
Dimitri Fontaine
ff275e69f1 Preliminary parsing of syslog messages. 2013-08-29 11:42:49 +02:00
Dimitri Fontaine
f3b6054432 Begin working on importing from zip files with plenty of wild guessing... 2013-08-19 23:38:58 +02:00
Dimitri Fontaine
5ed766c570 Fully integrate data transformation rules. 2013-08-07 18:42:48 +02:00
Dimitri Fontaine
7b2b208e59 Actually produce code from the LOAD DATABASE FROM command. 2013-08-05 18:23:27 +02:00
Dimitri Fontaine
800df8e91d Use the new casting rules facilities in mysql.lisp 2013-07-30 21:15:26 -07:00
Dimitri Fontaine
22246ccd2d Add a COPY command parser, using esrap. 2013-05-09 15:44:17 +02:00
Dimitri Fontaine
13fcf9f096 Clean-up connection parameters and their default values. 2013-03-16 23:21:32 +01:00