pgloader

mirror of https://github.com/dimitri/pgloader.git synced 2025-08-07 23:07:00 +02:00

Author	SHA1	Message	Date
Dimitri Fontaine	12e788094b	Improve sexp parser and standard symbols support. Also add split-sequence to the list of special cases that we can use and that are not found in the pgloader.transforms package. Fixes #965.	2019-05-14 15:49:24 +02:00
Dimitri Fontaine	98b465fbef	Add the new DBF tests in the test suite. All with expected results so that we can track regressions there.	2019-05-11 22:13:18 +02:00
Dimitri Fontaine	dae5dec03c	Allow fields/columns projections when parsing header. When using a CSV header, we might find fields in a different order than the target table columns, and maybe not all of the fields are going to be read. Take account of the header we read rather than expecting the header to look like the target table definition. Fix #888.	2019-01-15 22:39:08 +01:00
Dimitri Fontaine	d356bd501b	Accept even more ragged date format input. When parsing a date string from a date format, accept that the ms or us part be completely missing, rather than just missing some digits. Fixed #828.	2018-09-10 19:37:36 +02:00
Dimitri Fontaine	b685c8801d	Improve guessing of CSV parameters. In this commit we fail the guess faster, which allows testing against a much larger sample. The sample size is still hard-coded, but this time to 1000 lines. Also add a test case, see #618.	2017-08-24 13:30:14 +02:00
Dimitri Fontaine	40c1581794	Review transaction and error handling in COPY. The PostgreSQL COPY protocol requires an explicit initialization phase that may fail, and in this case the Postmodern driver transaction is already dead, so there's no way we can even send ABORT to it. Review the error handling of our copy-batch function to cope with that fact, and add some logging of non-retryable errors we may have. Also improve the thread error reporting when using a binary image from where it might be difficult to open an interactive debugger, while still having the full blown Common Lisp debugging experience for the project developers. Add a test case for a missing column as in issue #339. Fix #339, see #337.	2016-02-21 15:56:06 +01:00
Dimitri Fontaine	150d288d7a	Improve our regression testing facility. Next parallelism improvements will allow pgloader to use more than one COPY thread to load data, with the impact of changing the order of rows in the database. Rather than doing a copy out and `diff` of the data just loaded, load the reference data and do the diff in SQL: `select * from loaded.data except select * from expected.data`. If such a query returns any row, we know we didn't load what was expected and the regression test is failing. This regression testing facility should also allow us to finally add support for multiple-table regression tests (sqlite, mysql, etc).	2015-11-17 17:03:08 +01:00
Dimitri Fontaine	598c860cf5	Improve user code parsing, fix #297. To be able to use "t" (or "nil") as a column name, pgloader needs to be able to generate lisp code where those symbols are available. It's simple enough in that a Common Lisp package that doesn't :use :cl fulfills the condition, so intern user symbols in a specially crafted package that doesn't :use :cl. Now, we still need to be able to run transformation code that is using the :cl package symbols and the pgloader.transforms functions too. In this commit we introduce a heuristic to pick symbols either as functions from pgloader.transforms or anything else in pgloader.user-symbols. And so that user code may use NIL too, we provide an override mechanism to the intern-symbol heuristic and use it only when parsing user code, not when producing Common Lisp code from the parsed load command.	2015-09-21 13:23:21 +02:00
Dimitri Fontaine	78c6bf097a	Fix the build again. Once more I changed a test file's data and forgot to commit the changes to the expected file of the regression test.	2015-09-12 00:40:15 +02:00
Dimitri Fontaine	3f539b7384	Travis: update expected output file. Forgot to update the expected output file in the previous commit, which Travis is rightfully complaining about...	2015-09-07 17:47:03 +02:00
Dimitri Fontaine	04ddf940d9	Left pad COPY octal chars with 0, fix #275. The COPY TEXT format accepts non-printable characters with an escaped sequence wherein pgloader can pass in the octal number for the character in its encoding. When doing that with small numbers like \6 and the non-printable character is then followed by other numbers, then it becomes e.g. \646 which might not be part of the target encoding... To fix, always left pad the character octal number with zeroes, so that we now send in \00646 which COPY knows how to read: the char at \006 then 4 then 6. Also copy the test case over to pgloader and run it in the test suite.	2015-08-20 18:17:18 +02:00
Dimitri Fontaine	833b41c23b	Fix the regression test expected values, see #266.	2015-07-26 14:45:43 +02:00
Dimitri Fontaine	1c7de22096	Add test coverage for #80.	2015-06-25 14:16:12 +02:00
Dimitri Fontaine	ba7b27b60a	Travis: actually push the right version of the expected file.	2015-05-22 12:41:29 +02:00
Dimitri Fontaine	bffec4cc63	Allow for more options in the CSV escape character, fix #38. To allow for importing JSON one-liners as-is in the database it can be interesting to leverage the CSV parser in a compatible setup. That setup requires being able to use any separator character as the escape character.	2015-05-22 12:31:06 +02:00
Dimitri Fontaine	abbc105c41	Implement CSV headers support. Some CSV files are given with a header line containing the list of their column names; use that when given the option "csv header". Note that when both "skip header" and "csv header" options are used, pgloader first skips the required number of lines and then uses the next one as the csv header. Because of a temporary failure to install the `ronn` documentation tool, this patch only commits the changes to the source docs and does not update the man page (pgloader.1). A follow-up patch is intended to fix that. See #236, which is using shell tricks to retrieve the field list from the CSV file itself and motivated this patch to finally get written.	2015-05-21 12:55:23 +02:00
Mark Lee	dc04b40836	Accept periods in CSV field names. Periods are allowed in PG column names as well.	2015-05-15 07:22:07 -07:00
Dimitri Fontaine	95a5eb3184	Implement more COPY options, fix #218. The COPY format now supports user defined delimiter and null options, and we don't require the column names anymore as it's useless in that context.	2015-04-30 14:30:16 +02:00
Dimitri Fontaine	53dcdfd8ef	Fix handling of COPY data, fix #222. When given a file in the COPY format, we should expect that its content is already properly escaped as expected by PostgreSQL. Rather than unescape the data then escape it again, add a new mode of operation to format-vector-row in which it won't even try to reformat the data. In passing, fix an off-by-one bug in dealing with non-ascii characters.	2015-04-30 13:17:02 +02:00
Dimitri Fontaine	48f451bdbc	Implement the option to disable triggers when loading data. This option is dangerous and allows skipping ALL triggers when loading data against PostgreSQL. This includes foreign key constraints definitions and will allow loading data out of order. When using both the options "create no table" and "disable triggers" it will be possible to load data into a schema prepared by your favorite external tool, at the cost of not validating FK constraints. Use with care. Fix #167.	2015-02-19 15:05:10 +01:00
Dimitri Fontaine	00b002124b	Travis: switch test case to timestamp, dropping the TZ. The timezone differs between my own machine and the test system, so just get rid of that discrepancy so that the test stops failing.	2014-10-02 01:32:19 +02:00
Dimitri Fontaine	2f49c9614c	Fix the new test case's out file.	2014-10-02 01:16:58 +02:00
Dimitri Fontaine	7cf7e714fc	Implement the source date format option.	2014-10-02 01:03:24 +02:00
Dimitri Fontaine	ea97fc4659	Implement a new source level filter: trim. As seen in #116, it might be better for the users to be able to ask for field trimming right in the source definition, like we do for processing nulls.	2014-09-29 15:16:04 +02:00
Dimitri Fontaine	07b5aa3ed6	Add BEFORE/AFTER LOAD clauses to IXF and DBF commands.	2014-07-17 16:56:13 +02:00
Dimitri Fontaine	32b4cf23e8	New test case showing off the 'null if' source field option, see #80.	2014-06-16 19:59:08 +02:00
Dimitri Fontaine	fe4f577300	Add the csv-filename-pattern test to the suite.	2014-06-16 14:17:53 +02:00
Dimitri Fontaine	65aabb8216	Add a dbf test to the regression suite.	2014-06-16 14:17:33 +02:00
Dimitri Fontaine	3bcd236de6	Add automated regression tests. Those tests currently only work when a single table is the target of the load, and when this target is explicit in the INTO target clause. More work needs to be done to cover interesting cases like MySQL and SQLite where we want to diff a full database rather than a single table.	2014-06-03 12:19:23 +02:00
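
Commit 150d288d7a describes comparing loaded data against reference data with an SQL `except` query instead of a textual `diff`, so that row order no longer matters. The following is a minimal sketch of that idea, not pgloader's own code: it uses Python's standard `sqlite3` module in place of PostgreSQL, and the table names `loaded_data` and `expected_data` and the helper `regression_diff` are hypothetical.

```python
import sqlite3


def regression_diff(conn, loaded, expected):
    """Return rows found in `loaded` that are absent from `expected`.

    Table names are interpolated directly, which is acceptable only
    because this sketch uses fixed, trusted names.
    """
    query = f"select * from {loaded} except select * from {expected}"
    return conn.execute(query).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    create table loaded_data (id integer, name text);
    create table expected_data (id integer, name text);
    insert into expected_data values (1, 'a'), (2, 'b'), (3, 'c');
    -- Same rows, inserted in a different order.
    insert into loaded_data values (3, 'c'), (1, 'a'), (2, 'b');
    """
)

# Row order differs, but the set of rows is the same: no difference.
same_rows = regression_diff(conn, "loaded_data", "expected_data")

# Corrupt one loaded row: the query now reports it.
conn.execute("update loaded_data set name = 'x' where id = 2")
changed_rows = regression_diff(conn, "loaded_data", "expected_data")
```

Note that a single `except` query is one-directional: it reports rows that were loaded but not expected, and would not by itself report expected rows that are missing from the loaded table.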

29 Commits
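
Commit 04ddf940d9 explains why octal escapes sent in the COPY TEXT format are left padded with zeroes. The sketch below illustrates the ambiguity in Python; it is an illustration under stated assumptions rather than pgloader's implementation, and the function name `copy_octal_escape` is hypothetical. It relies on the COPY text format reading a backslash followed by one to three octal digits as a single byte.

```python
def copy_octal_escape(byte_value: int) -> str:
    """Format a byte as a backslash followed by exactly three octal digits."""
    if not 0 <= byte_value <= 0o377:
        raise ValueError("octal escapes cover a single byte (0 to 255)")
    return "\\{:03o}".format(byte_value)


# Character code 6 followed by the literal digits "46".
# Without padding, the three digits merge into one escape: \646
unpadded = "\\{:o}".format(6) + "46"

# With padding, the escape is complete after three digits: \006, then 4, then 6.
padded = copy_octal_escape(6) + "46"
```

Because the padded escape always uses all three digits, any digits that follow are read as ordinary data.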