The handling of the SQLite catalogs was fixed in a previous patch, but
either it's been broken since or it never actually worked (oops).
Moreover, the recent patch about :on-update-current-timestamp changed the
casting-rule matching code, and the SQLite module should now set
:auto-increment rather than the "auto_increment" string it used before.
That's better, but it hadn't been done yet.
Fix#563 again, tested with a provided test-case (thanks!).
Namely, the actions are “keep extra” and “drop extra” and the casting rule
guard is “with extra on update current timestamp”. Having support for those
elements in the casting rules allows a definition such as the following:
type timestamp with extra on update current timestamp
to "timestamp with time zone" drop extra
The effect of such a cast rule is to ignore the MySQL extra definition,
which keeps pgloader from creating the PostgreSQL triggers that implement
the same behavior.
Fix#735.
In case of a failure to pre-process or transform values in a row that has
been read, we need to refrain from pushing the row into our next batch.
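A minimal sketch of the logic, where apply-transforms and push-row are
hypothetical helpers standing in for the real code:

(defun process-and-batch-row (copy batch row)
  ;; only push the row when every pre-processing step succeeds;
  ;; apply-transforms and push-row are hypothetical helpers
  (handler-case
      (push-row batch (apply-transforms copy row))
    (condition (e)
      ;; on error, log and skip the row instead of batching it
      (log-message :error "~a" e))))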
See #726, which got hit by this recent bug in the middle of something else
entirely.
When dealing with MATERIALIZE VIEWS test cases and failing in the middle
of them, as happens when fixing bugs, it was tedious (to say the least) to
clean up the views manually each time.
That said, for end users, doing it automatically would risk cleaning up the
wrong view definition if they had a typo in their pgloader command, say.
Common Lisp helps a lot here: we simply create a restart that is only
available interactively for the developers of pgloader!
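A minimal sketch of the technique, with hypothetical create-matview and
drop-matview helpers:

(defun create-view-with-restart (pgconn view-definition)
  ;; create-matview and drop-matview are hypothetical helpers
  (restart-case
      (create-matview pgconn view-definition)
    (drop-and-recreate ()
      :report "Drop the materialized view and create it again."
      (drop-matview pgconn view-definition)
      (create-matview pgconn view-definition))))

Nothing in the code ever invokes the restart, so it's only offered
interactively in the debugger, which is exactly what we want here.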
We forgot that rule when creating the target tables for the MATERIALIZE
VIEWS commands, which led to surprising and wrong behavior.
Fix#721, and add a new test case while at it.
It might be that the schema exists but we didn't find what we expected in
there, so it didn't make it into pgloader's internal catalogs. Be friendly
to the user with a better error message.
Fix#713.
Refactor the file organisation further to allow adding a “direct stream”
option when the on-error-stop behavior has been selected. This currently
happens by default for database sources.
Introduce the new WITH option “on error resume next”, which forces the
classic behavior of pgloader. The option “on error stop” already existed;
its implementation is new.
When this new behavior is activated, the data is sent to PostgreSQL
directly, without intermediate batches being built. It means that the whole
operation fails at the first error, and we don't have any information in
memory to try replaying any COPY of the data. It's gone.
This behavior should be fine for database migrations, as you don't usually
want to fix the data manually in intermediate files; you want to fix the
problem at the source database and do the whole dance all over again, until
your casting rules are perfect.
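And when the classic behavior is wanted anyway, it can be spelled out in
the load command, as in this sketch (connection strings are placeholders):

load database
     from mysql://user@localhost/dbname
     into postgresql:///dbname
with on error resume next;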
This patch might also bring some performance benefits in terms of both
timing and memory usage, though local testing didn't show much of anything
for the moment.
Copy some code over from cl-postgres-trivial-utf-8 and add support for
PostgreSQL COPY escaping right at the same place, allowing us to allocate
our formatted utf-8 buffer only once, with the escaping already applied.
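A simplified sketch of the idea; the real code emits utf-8 bytes into a
single preallocated buffer, whereas this version uses a string stream for
brevity:

(defun escape-copy-text (string)
  ;; apply COPY text-format escaping in the same pass that walks the
  ;; input, instead of a separate post-processing step over the buffer
  (with-output-to-string (out)
    (loop :for char :across string
          :do (case char
                (#\\       (write-string "\\\\" out))
                (#\Newline (write-string "\\n" out))
                (#\Return  (write-string "\\r" out))
                (#\Tab     (write-string "\\t" out))
                (t         (write-char char out))))))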
This patch was expected to be more about performance, but it seems to be
mostly a code clean-up, as it doesn't make a big difference in the testing
I could do here.
That said, getting rid of one intermediate buffer should be nice in terms of
memory management.
The COPY format and batch facilities are no longer the meat of our
PostgreSQL support in the src/pgsql directory, so have them live in their
own space.
This function prepares the data to be sent down to PostgreSQL as clean
COPY text with unicode handled correctly. This commit is mainly a clean-up
of the function, and also adds some smarts to try and make it faster.
In testing, the function is now somewhat faster than before, but not by
much. The hope here is that it's now easier to optimize.
We now have a qmynd-impl::decoding-error condition to deal with, which has
very good error reporting, so we don't need to poke into babel details
anymore. The error message adds the column name, type and collation to the
output, too.
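A minimal sketch of the new handler, with a hypothetical fetch-rows call
standing in for the actual qmynd query:

(defun query-with-decoding-guard (mysql-connection sql)
  ;; fetch-rows is a hypothetical stand-in for the qmynd query call
  (handler-case (fetch-rows mysql-connection sql)
    (qmynd-impl::decoding-error (e)
      ;; the condition formats the offending bytes nicely already;
      ;; we re-report it with column name, type and collation added
      (log-message :error "~a" e))))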
We keep the babel handlers for a while until people have all migrated to
using the patch in qmynd.
With the fix to Qmynd, Fix#716.
The previous patch introduced parser conflicts and we couldn't parse some
expressions any more, such as the following:
fields escaped by '\',
It's now possible to represent a single quote as either '''', '\'', or
'0x27', and we can still parse '\' as being a single backslash character.
See #705.
The option "fields optionally enclosed by" was missing a way to easily
specify a single quote as the quoting character. Add '\'' to the existing
solution '0x27' which isn't as friendly.
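So both of the following spellings now work:

fields optionally enclosed by '\''
fields optionally enclosed by '0x27'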
See #705.
The query for concurrency-support didn't get the memo that we should ignore
PostgreSQL identifier-case when querying the source MySQL database. Fix the
query string to include column names as given by the MySQL catalogs.
In bug report #703, the problem is found in PostgreSQL queries. This has
been fixed before already. Trying to reproduce the bug produced an error in
the concurrency-support query instead, so let's fix this one.
Fix#703.
The website is moving to pgloader.org and readthedocs.io is going to be
integrated. Let's see what happens. The docs build fine locally with the
sphinx tools and the docs/Makefile.
Having separate files for the documentation should help ease maintenance
and the addition of new topics, such as Common Lisp hacker-level docs,
which are currently missing.
When this function was written, pgloader would get an array of numbers over
the wire; nowadays it looks like it's receiving an array of characters
instead (in other words, a string).
Improve the `bits-to-boolean` function to accept either input, and raise an
error in any other case.
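A minimal sketch of the improved function; pgloader's exact code may
differ:

(defun bits-to-boolean (bit-vector)
  ;; accept either a vector of numbers or a string for a MySQL bit(1)
  ;; value; etypecase signals an error for anything else
  (when (and bit-vector (= 1 (length bit-vector)))
    (let ((bit (aref bit-vector 0)))
      (etypecase bit
        (integer   (if (zerop bit) "f" "t"))
        (character (if (zerop (char-code bit)) "f" "t"))))))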
My theory is that something changed either in MariaDB (with version 10) or
in the Qmynd driver somehow... but tonight we just go easy and fix the bug
locally rather than try to understand where it's coming from.
Fixes#684.
Due to the way pgloader queries the PostgreSQL catalogs, it restricted the
target table to being an “ordinary” table, as per the relkind description
in the PostgreSQL documentation:
https://www.postgresql.org/docs/current/static/catalog-pg-class.html
Extend this to support relkind of 'r', 'f' and 'p'.
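In the catalog query, the restriction now looks something like this (a
sketch, not the exact query string):

-- 'r' is an ordinary table, 'f' a foreign table, 'p' a partitioned table
and c.relkind in ('r', 'f', 'p')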
Fixes#587, fixes#690.
SQLite being very, very liberal in type names (I think it actually accepts
anything and everything), our simple approach of tokenizing the input and
discarding noise words is not enough.
In this patch, we implement a new light parser for the SQLite type names to
better cope with noise words and the random spacing of catalog values that
SQLite doesn't normalize. Well, it doesn't even attempt to, apparently.
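A gross simplification of what the parser has to cope with, namely random
spacing around the type name and its typemod; the real parser also knows
about multi-word type names and noise words:

(defun parse-sqlite-type-name (type-name)
  ;; "DECIMAL ( 10 , 2 )" => values "decimal" and "10 , 2"
  (let* ((name   (string-downcase (string-trim " " type-name)))
         (lparen (position #\( name)))
    (if lparen
        (values (string-trim " " (subseq name 0 lparen))
                (string-trim " ()" (subseq name lparen)))
        (values name nil))))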
Fix#548.
Given INCLUDING and EXCLUDING support, it's possible that we migrate a
table from SQLite without having selected the tables pointed to by its
foreign keys. In that case, pgloader should still be able to load the data
definition and content fine, just skipping the incomplete fkey definitions.
That's implemented in this patch, which has been tested thanks to a
reproducible data set being made available!
Fixes#681.
In cases where the MS SQL database is set up with a case-sensitive
collation, pgloader would not find the catalog objects referenced from its
queries. To fix, just use UPPERCASE names, as they work in both
case-insensitive and case-sensitive collations.
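With a case-sensitive collation, only the uppercase spelling of the
catalog views is guaranteed to resolve, as in this sketch of a catalog
query:

select TABLE_SCHEMA, TABLE_NAME
  from INFORMATION_SCHEMA.TABLES;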
In passing, add `system-index.txt` to `.gitignore` (generated by make).
Fixes#651.
The following casting rules are now the default for MySQL:
- type tinyint when unsigned to smallint drop typemod
- type smallint when unsigned to integer drop typemod
- type mediumint when unsigned to integer drop typemod
- type integer when unsigned to bigint drop typemod
Fixes#678.
MySQL allows using unsigned data types, and pgloader should then target a
signed type of larger capacity so that all values can fit. For example, the
data definition “smallint(5) unsigned” should be cast to “integer”.
This patch allows user-defined cast rules to be written against “unsigned”
data types as per their MySQL catalog representation.
See #678.
When doing a MySQL to PostgreSQL migration in data-only mode, pgloader
matches schema names found on both the source and target databases, and
much like with table names, it must do so using unquoted schema names.
Otherwise, when using the “quote identifiers” option, we fail to find the
schema name again, because one spelling has the quotes and the other one
doesn't.
Fix#659, at least some forms of it.
The error handling would try to read past the error buffer in some cases,
when the BABEL lib gives a position that's after the end of the buffer
read.
Fix#661.
In the next release, pgloader defaults to targeting a new schema named the
same as the MySQL database, because that's what makes more sense. But
people are used to having 'public' in the search_path and everything in
there.
So when creating our target schema, when migrating from MySQL, arrange it so
that the new schema is in the search_path by issuing a command like:
ALTER DATABASE plop SET search_path TO public, f1db;
And make this command visible in verbose (NOTICE) mode too, so that users
can see what happens.
Fix#654. I think.
It helps a lot to debug what's happening, and it seems that we lost the
information when cleaning up the log levels in recent efforts to unclutter
the default output.
It turns out that when using *print-pretty* in CCL we then have CL reader
references in the output, such as in the following example:
QUERY: comment on table mysql.base64 is $#1=DXIDC_EMLAQ$Test decoding base64 documents$#1#$
Of course that's wrong, so prevent this from happening by
forcing *print-pretty* to nil in a top-level function. We still turn this on
in the monitor thread when printing error messages as those might contain
recursive data structures.
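The fix boils down to establishing the binding once, at the top level, as
in this sketch (top-level and run-commands are hypothetical names):

(defun top-level (argv)
  ;; disable pretty printing for the whole run, so the printer never
  ;; emits #1= reader references in our formatted SQL
  (let ((*print-pretty* nil))
    (run-commands argv)))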
When using --verbose or more detailed log messages, the summary prints
timings for both read and write operations separately. The write summary
timing took into account only the PostgreSQL batch activity, discarding the
formatting of the data done by pgloader.
As this formatting is quite heavy at the moment, the results are pretty
misleading without that information.
A stop-gap has been installed to prevent sending too much traffic to the
monitor, but the log-message arguments were still being evaluated, and the
:data level output from format-row-in-batch is pretty costly.
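A minimal sketch of the idea, with a hypothetical log-level-enabled-p
predicate: wrap log-message in a macro so the costly arguments aren't
evaluated when the level is filtered out anyway.

(defmacro log-message* (level fmt &rest args)
  ;; expand to a guard that skips evaluating the format arguments
  ;; entirely when the message wouldn't be sent to the monitor
  `(when (log-level-enabled-p ,level)
     (log-message ,level ,fmt ,@args)))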
Install a new function in the hooks file. This function might help fix
--self-upgrade later; we keep it around for when we'll have time to see
about that.
The ql:*local-project-directories* mechanism is a much better facility for
loading pgloader from the local PWD rather than from the QL distribution.
It looks like the previous method worked by accident, for once, and also
downloaded pgloader from QL unnecessarily (we have the sources locally).
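The build scripts can then do something along these lines (a sketch):

;; make the current checkout visible to Quicklisp, then load it
(push (uiop:getcwd) ql:*local-project-directories*)
(ql:register-local-projects)
(ql:quickload "pgloader")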
Errors such as failing to open the log file (maybe because of bad
permissions) weren't correctly handled. This fixes the problem by handling
the conditions at the lparallel task handler level and signaling a brand new
condition up to the main outside handler.
Fixes#638.
We did it correctly for the bytes, and we need to apply the same logic to
the other metrics: the relevant information in the total summary line is
the sum from the data parts, not the sum from the postload parts.
The default values quoting changed in MariaDB 10, and we need to adjust in
pgloader: extra '' chars could defeat the default matching logic:
"'0000-00-00'" is different from "0000-00-00"
The MySQL special syntax "on update current_timestamp()" used to support
only a single column per table (in MySQL), and so did pgloader. In MariaDB
version 10 it's now possible to have several columns with that special
treatment, so adapt pgloader to migrate that too.
What pgloader does is recognize that several columns are to receive the
same pre-update processing, and create a single function that handles all
of them, as in the following example, from pgloader logs in a test case:
CREATE OR REPLACE FUNCTION mysql.on_update_current_timestamp_onupdate()
 RETURNS trigger
 LANGUAGE plpgsql
AS $$
BEGIN
  NEW.update_date = now();
  NEW.calc_date = now();
  RETURN NEW;
END;
$$;

CREATE TRIGGER on_update_current_timestamp
  BEFORE UPDATE ON mysql.onupdate
  FOR EACH ROW
  EXECUTE PROCEDURE mysql.on_update_current_timestamp_onupdate();
Fixes#629.