The parser was happily parsing connection strings such as the
following, but the rest of the code didn't really know what to do with
them:
mysql://unix:/var/run/mysqld/mysqld.sock:/main
In passing, fix bugs where the PostgreSQL unix domain socket support
was still a few bricks shy of a load, omitting to consider the case
where the connection host is actually a cons such as
'(:unix . "path/to/socket").
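A minimal sketch of the dispatch, with connection-host-spec as a
hypothetical helper; the host is either a hostname string or a cons
naming a unix domain socket:

    (defun connection-host-spec (host)
      "Return the hostname or unix socket path for HOST."
      (etypecase host
        (string host)
        (cons (ecase (car host)
                (:unix (cdr host))))))

    ;; (connection-host-spec '(:unix . "/var/run/mysqld/mysqld.sock"))
    ;;   => "/var/run/mysqld/mysqld.sock"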
Some of our internal values now depend on the implementation, and could
either be a symbol on SBCL or an external-format structure on CCL. We
could typecase our way out, I suppose, but it might be that SBCL has a
different version of the external-format type, so we'd rather use #+.
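A minimal sketch of the #+ approach; the CCL accessor name used here is
an assumption:

    (defun encoding-name (external-format)
      "Normalize EXTERNAL-FORMAT to the encoding it names."
      #+sbcl external-format   ; already a symbol on SBCL
      #+ccl  (ccl:external-format-character-encoding external-format))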
First, despite the documentation mentioning that the function writes
to *terminal-io*, in fact it's doing (format t ...) and thus the result
is written to *standard-output*.
Second, CCL has encodings with no aliases.
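In Common Lisp a format destination of t means *standard-output*, so
the first two forms below are equivalent; only the third really targets
the terminal:

    (format t "hello~%")                   ; writes to *standard-output*
    (format *standard-output* "hello~%")   ; same thing, spelled out
    (format *terminal-io* "hello~%")       ; what the docs implied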
It used to still launch an extra set of threads for monitoring, and
that would confuse CCL, where it's not possible to write to a stream
from more than one thread concurrently.
Try to get a deterministic output of it; apparently that's still not
always the case when using SBCL, even now that it's been switched to
using the explicit *terminal-io* rather than t.
This change is needed for CCL support, though, where you don't get to
write to the same stream from different threads.
I could get down to the problem here, which is that a couple of indexes
were reported to pgloader without any SQL definition for them, and then
pgloader would wait for non-existing tasks. It seems easier to just
skip those indexes, and that's what this patch does.
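A minimal sketch of the skip, where indexes, index-sql and log-message
are hypothetical names used for illustration:

    (loop :for index :in indexes
          :if (index-sql index)
            :collect index
          :else
            :do (log-message :warning
                             "Skipping index ~a: no SQL definition"
                             index))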
In particular, allow for a space to be used in the filename. The only
character that is not permitted anymore is the quote itself ('); it
should be easy enough to allow for escaping it, as in the password
field, if required.
Should probably fix #54, even though the lack of data currently
reported in that issue makes it a blind guess only.
The new WITH options allow the user to set values for the dynamic
variables *copy-batch-rows*, *copy-batch-size* and *concurrent-batches*.
That's needed in cases like issue #16, even with the batch size
defaulting to what looks like a proper setup.
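A minimal sketch of what the options amount to, assuming the variables
are special and read by the batch machinery at run time; load-from-source,
source and target are hypothetical names:

    (let ((*copy-batch-rows*    25000)
          (*copy-batch-size*    (* 20 1024 1024)) ; 20 MB
          (*concurrent-batches* 2))
      (load-from-source source target))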
In the longer term, a serious review of pgloader's memory usage should
be done, the numbers being way higher than the batch sizes we set up
here.
When declaring the types of arguments (mainly done to hint the Common
Lisp compiler into generating more efficient code), it's important to
account for the possibility of the arguments being NIL, of type NULL.
That's been made clear in the way the projection function is now
generated in the project-fields function in src/sources/source.lisp,
with all the arguments now being &optional so that we are able to cope
with ragged CSV files.
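A minimal sketch, not the actual generated code: making every column
&optional means a ragged row simply binds its missing fields to NIL,
and the declared types must allow for that:

    (lambda (&optional name age city)
      (declare (type (or null simple-string) name age city))
      (list name
            (when age (parse-integer age))
            city))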
The only expected change from this patch is that some warnings go
missing in some test cases, such as test/reformat.load, test/fixed.load
and test/archive.load.
For the generated binary to be really portable, we need to be able to
load openssl 1.0.1 even when we've been built against openssl 1.0.0.
A way to achieve that with SBCL is to force the unloading of the lib
at image saving time and to register a hook to load it again at image
init time. Using the proper API, CFFI will happily load the available
file for the lib rather than insisting on loading the exact same one
as found on the build machine.
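A minimal sketch under SBCL, assuming cl+ssl names its foreign library
cl+ssl::libssl:

    (push (lambda () (cffi:close-foreign-library 'cl+ssl::libssl))
          sb-ext:*save-hooks*)
    (push (lambda () (cffi:load-foreign-library 'cl+ssl::libssl))
          sb-ext:*init-hooks*)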
The babel character-decoding-error condition is exposing both its
internal BUFFER and the current OCTETS, and it seems we should refer to
the BUFFER in our error reporting...
When it's not possible to decode a MySQL value in the given encoding,
automatically replace the value with nil and be quite verbose about it
by logging an error.
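A minimal sketch of the replacement, assuming Babel does the decoding;
log-message is a hypothetical logging helper:

    (defun decode-mysql-value (octets encoding)
      "Decode OCTETS in ENCODING, or return NIL and log an error."
      (handler-case
          (babel:octets-to-string octets :encoding encoding)
        (babel:character-decoding-error (e)
          (log-message :error "Could not decode value as ~a: ~a"
                       encoding e)
          nil)))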
The patch from pull request #30 was hard-coding the PostgreSQL-side
quoting; we are using the quote_ident() function instead, as it's now
available in every PostgreSQL production release (8.4 included).
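A minimal sketch of letting PostgreSQL do the quoting, assuming
Postmodern is the client library in use:

    (defun quoted-identifier (name)
      "Have PostgreSQL quote NAME as an identifier."
      (pomo:query "select quote_ident($1)" name :single))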
With the new internal setting *copy-batch-size* it's now possible to
instruct pgloader to close batches early (before the *copy-batch-rows*
limit) when crossing the byte count threshold.
When set to 20 MB it allows the new test case (exhausted) to pass under SBCL
and CCL, and there's no measurable cost when *copy-batch-size* is set to
nil (its default value) in the testing done.
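A minimal sketch of the early-close test, with batch-rows, batch-bytes
and finish-batch as hypothetical names:

    (when (or (<= *copy-batch-rows* batch-rows)
              (and *copy-batch-size*
                   (<= *copy-batch-size* batch-bytes)))
      (finish-batch))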
This patch is published without any way to tune the values from the
command language yet; that's the next step, once it's been proven
effective.
With this patch, the whole data massaging and final formatting into the
PostgreSQL COPY TEXT format is done by the reader thread, which
publishes a batch at a time into the communication channel: a
lparallel.queue object. Before that, the raw vectors were pushed
directly into the queue, offering more flexibility to adjust to the
reader and writer IO rates and capabilities, but impeding the Garbage
Collector: data still in the queue could not be collected even when not
needed anymore.
The new model also uses less memory, and allows better control over how
much data stays in memory. The new *concurrent-batches* parameter
should be key to being able to process huge rows.
The intent is to offer a way for users to tune *concurrent-batches*
down to 1 for sources with a massive per-row memory footprint. Even
better would be to find a way to automatically adjust the setting
without spending too much time counting the bytes we're batching.
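A minimal sketch of the back-pressure involved, using lparallel.queue's
fixed-capacity queues; batch stands in for one formatted batch:

    (let ((queue (lparallel.queue:make-queue
                  :fixed-capacity *concurrent-batches*)))
      ;; push-queue blocks when a fixed-capacity queue is full,
      ;; pop-queue blocks when it is empty
      (lparallel.queue:push-queue batch queue)
      (lparallel.queue:pop-queue queue))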
Preliminary tests show no noticeable performance impact from this
patch, and even some improvements in some cases.