Commit Graph

1568 Commits

Author SHA1 Message Date
Mikael Sand
3c4e64ed26 Fix spelling error in Windows-only code path (#545)
Fix spelling error for uiop:make-pathname* key by changing :direction to :directory
2017-04-30 19:53:53 +02:00
Dimitri Fontaine
8254d63453 Fix incorrect per-table total time metrics.
The concurrent nature of pgloader made it non-obvious where to implement
the timers properly, and as a result the tracking of how long it took to
actually transfer the data was... just wrong.

Rather than trying to measure the time spent in any particular piece of the
code, we now emit "start" and "stop" stats messages to the monitor thread at
the right places (which are way easier to find, in the worker threads) and
have the monitor figure out how long it took really.

Fix #506.
2017-04-30 18:09:50 +02:00
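The "start"/"stop" approach described in the commit above can be illustrated with a minimal sketch. This is not pgloader's code (pgloader is written in Common Lisp); the names `worker`, `monitor` and the event tuple layout are hypothetical, chosen only to show workers emitting events and a monitor deriving the elapsed time per table.

```python
import queue
import threading
import time


def worker(table, events, load):
    """Emit a start event, do the work, then emit a stop event."""
    events.put(("start", table, time.monotonic()))
    load()
    events.put(("stop", table, time.monotonic()))


def monitor(events, expected_stops):
    """Derive per-table elapsed time from the start/stop events alone."""
    first_start, last_stop = {}, {}
    stops = 0
    while stops < expected_stops:
        kind, table, stamp = events.get()
        if kind == "start":
            first_start[table] = min(stamp, first_start.get(table, stamp))
        else:
            last_stop[table] = max(stamp, last_stop.get(table, stamp))
            stops += 1
    # Elapsed time spans from the earliest start to the latest stop.
    return {t: last_stop[t] - first_start[t] for t in last_stop}


events = queue.Queue()
threads = [
    threading.Thread(target=worker, args=("users", events, lambda: time.sleep(0.05)))
    for _ in range(2)  # two hypothetical workers loading the same table
]
for t in threads:
    t.start()
for t in threads:
    t.join()
timings = monitor(events, expected_stops=2)
```

The workers never measure anything themselves; they only report when they begin and end, which is the part that is easy to locate in concurrent code.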
Dimitri Fontaine
20ea1d78c4 Improve default summary readability.
Now that we have fixed the output of the per-table total timing, we can show
only
show that timing by default. With more verbosity pgloader will add the extra
columns, and in computer oriented formats (json, csv, copy) all the details
are always provided of course.

See #506.
2017-04-30 18:09:50 +02:00
geethaRam
0e12d77a7f Update pgloader.spec (#537)
Updated version in spec
2017-04-17 23:20:14 +02:00
Dimitri Fontaine
9b4bbdfef7 Review --load-lisp-file error handling.
The handler-case form installed would catch any non-fatal warning and would
also fail to display any error to the user. Both are wrong behavior that
this patch fixes, using *error-output* (that's stderr) explicitly for
anything that may happen while loading the user provided code.

Fix #526.
2017-04-16 21:22:46 +02:00
Dimitri Fontaine
538464f078 Avoid operator is not unique errors.
When the intarray extension is installed our PostgreSQL catalog query fails
because we now have more than one operator solving smallint[] <@ smallint[].
It is easy to avoid that problem by casting to integer[], smallint being an
implementation detail here anyway.

Fix #532.
2017-04-06 23:55:06 +02:00
Dimitri Fontaine
0219f55071 Review DROP INDEX objects quoting.
Force double-quoting of object names in DROP INDEX commands by using the
format directive ~s. The names of the objects we are dropping usually come
from a PostgreSQL catalog, but still might contain force-quote conditions
like starting with a number, as shown in #530.

This fix certainly means we will have to review all the DDL formatting we do
in pgloader and apply a single method of quoting all along. The simpler one
is of course to force quote every object name in "", but it might not be the
smartest one (what if some sources are sending already quoted object names,
that needs a check), and it's certainly not the prettiest way to go about it:
people usually like to avoid unnecessary quotes, calling them clutter.

Fix #530.
2017-04-01 22:37:26 +02:00
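As an illustration of the force-quoting idea in the commit above, here is a minimal Python sketch. The commit itself relies on the Common Lisp format directive ~s; the helpers `quote_ident` and `drop_index_sql` below are hypothetical stand-ins, not pgloader functions.

```python
def quote_ident(name: str) -> str:
    """Always double-quote a PostgreSQL identifier, doubling embedded quotes."""
    return '"' + name.replace('"', '""') + '"'


def drop_index_sql(schema: str, index: str) -> str:
    """Build a DROP INDEX command with every object name force-quoted."""
    return f"DROP INDEX {quote_ident(schema)}.{quote_ident(index)};"


# An index name starting with a digit is only valid when quoted.
print(drop_index_sql("public", "2017_idx"))
```

Quoting unconditionally is the simple option the commit message mentions; it trades some visual clutter for never having to decide whether a given name needs quotes.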
Dimitri Fontaine
e2bc7e4fd4 Fix github language statistics.
As documented in https://github.com/github/linguist#overrides and from the
conversation in https://github.com/github/linguist/issues/3540 we add a
.gitattributes file to the project wherein we pretend that the sql files are
all vendored-in.

This should allow GitHub to realize that pgloader is all about Common Lisp
and not a single line of PLpgSQL...
2017-03-26 11:26:29 +02:00
Dimitri Fontaine
b2f9590f58 Add support for MS SQL XML data type.
Given a test case and some reading of the FreeTDS source code, it appears
that the XML data type is sent on the wire as (unicode) text. This patch
makes pgloader aware of that and also revisits the choice of casting XML to
PostgreSQL XML data type (thanks to the test case where we see it just works
without surprise).

Fix #503.
2017-03-25 21:26:16 +01:00
Dimitri Fontaine
296e571e27 Fix MS SQL tinyint identity support.
Add a cast rule to support tinyint being an “identity” type in MS SQL, which
means using a sequence to derive its values from. We didn't address the
whole MS SQL integer type tower here, and suspect we will have to add more
in the future.

Fix #528 where I could have access to a test case and easily reproduce the
bug, thanks!
2017-03-22 11:38:40 +01:00
Dimitri Fontaine
940fc63a5e Distribute *root-dir* to all threads.
The creation of the reject and data files didn't happen in the
right *root-dir* setting for lack of sending the main value to the worker
threads.
2017-03-18 18:51:22 +01:00
Dimitri Fontaine
ab7e77c2d0 Fix double transformation call in CSV projections.
In advanced projections it could be that we call the transformation function
for some input fields twice. This is a bug that manifests in particular when
the output of the transformation can't be used/parsed again by the same
function as shown in the bug reported.

Fix #523.
2017-03-04 15:55:08 +01:00
Dimitri Fontaine
3fac222432 Fix MSSQL index column names quoting.
We have to pay attention that column names in MS SQL don't follow the same
rules as in PostgreSQL and may e.g. begin with numbers. Apply identifier
case and rules to index column names too.
2017-03-03 21:30:58 +01:00
Dimitri Fontaine
1023577f50 Review internal database migration logic.
Many options are now available to pgloader users, including shortcuts that
were not defined clearly enough. That could result in stupid things being
done at times.

In particular, when picking the "data only" option then indexes are not to
be dropped before loading the data, but pgloader would still try and create
them again at the end of the load, because the option that controls that
behavior defaults to true and is not impacted by the "data only" choice.

In this patch we review the logic and ensure it's applied in the same
fashion in the different phases of the database migration: preparation,
copying, rebuilding of indexes and completion of the database model.

See also 96b2af6b2a where we began fixing
oddities but didn't go far enough.
2017-02-26 14:48:36 +01:00
Dimitri Fontaine
8ec2ea04db Add support for MySQL geometry points.
The new version of the sakila database uses geometry typed columns that
contain POINT data. Add support for that kind of data by copying what we did
already for POINT datatype.
2017-02-25 21:52:41 +01:00
Dimitri Fontaine
9e2b95d9b7 Implement support for PostgreSQL storage parameters.
In PostgreSQL it is possible at CREATE TABLE time to set some extra storage
parameters, the most useful of them in the context of pgloader being the
FILLFACTOR. For the setting to be useful, it needs to be set at
CREATE TABLE time, before we load the data.

The BEFORE LOAD clause of the pgloader command allows running SQL scripts
that will be executed before the load, and even before the creation of the
target schema when pgloader does that, which is nice for other use cases.

Here we implement a new `ALTER TABLE` rule that one can set in the pgloader
command in order to change storage parameters at CREATE TABLE time:

  ALTER TABLE NAMES MATCHING ~/\./ SET (fillfactor='40')

Fix #516.
2017-02-25 21:49:06 +01:00
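To make the effect of the rule in the commit above concrete, here is a hedged Python sketch of applying storage parameters at CREATE TABLE time to tables whose names match a regular expression. The function `create_table_sql` and the `rules` structure are assumptions made for illustration; they do not mirror pgloader's internals.

```python
import re


def create_table_sql(qualified_name, columns, rules):
    """Build CREATE TABLE, adding WITH (...) when a rule matches the name.

    rules is a list of (compiled regex, {parameter: value}) pairs.
    """
    params = {}
    for pattern, settings in rules:
        if pattern.search(qualified_name):
            params.update(settings)
    sql = f"CREATE TABLE {qualified_name} ({', '.join(columns)})"
    if params:
        sql += " WITH (" + ", ".join(f"{k}={v}" for k, v in params.items()) + ")"
    return sql + ";"


# Mirrors the spirit of: ALTER TABLE NAMES MATCHING ~/\./ SET (fillfactor='40')
rules = [(re.compile(r"\."), {"fillfactor": 40})]
print(create_table_sql("public.users", ["id bigint", "name text"], rules))
```

Because the parameter is part of the CREATE TABLE statement, it is already in effect when the first rows are loaded.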
Dimitri Fontaine
57dd9fcf47 Add int as an alias for integer.
We cast MS SQL "int" type to "integer" in PostgreSQL, so add an entry in our
type name mapping where they are known equivalent to avoid WARNINGs about
the situation in DATA ONLY loads.
2017-02-25 17:54:57 +01:00
Dimitri Fontaine
5fd1e9f3aa Fix catalog merge hazards.
When reading table names from PostgreSQL, we might find some that need
systematic quoting (such as names that begin with a digit). In that case,
when later comparing the catalogs to match source database table names
against PostgreSQL catalog table names, we need to unquote the PostgreSQL
table name we are using.

In passing, force the *identifier-case* to :none when reading object names
from the PostgreSQL catalogs.
2017-02-25 17:53:08 +01:00
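The unquoting step described in the commit above can be sketched as follows. This is an illustrative Python approximation with hypothetical helper names (`unquote_ident`, `match_tables`), not the actual catalog merge code.

```python
def unquote_ident(name: str) -> str:
    """Strip surrounding double quotes and undo doubled embedded quotes."""
    if len(name) >= 2 and name.startswith('"') and name.endswith('"'):
        return name[1:-1].replace('""', '"')
    return name


def match_tables(source_names, pg_names):
    """Map each source table name to the PostgreSQL catalog name, if any."""
    by_bare_name = {unquote_ident(n): n for n in pg_names}
    return {s: by_bare_name.get(s) for s in source_names}


# The PostgreSQL side quotes the name that begins with a digit.
print(match_tables(["2017_orders", "users"], ['"2017_orders"', "users"]))
```

Comparing on the bare name avoids treating `"2017_orders"` and `2017_orders` as two different tables.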
Dimitri Fontaine
96b2af6b2a Fix a hang scenario in schema-only.
The parallelism in pgloader is now smart enough to begin fetching data from
the next table while the previous one is still not done being written down
to PostgreSQL, but when doing so I introduced a bug in the way indexes are
taken care of.

Specifically, in schema-only mode of operations, we would wait for indexes
we skipped creating. The skipping is the bug here, so make sure we create
indexes even when we don't want to copy any data over.
2017-02-25 17:09:07 +01:00
Dimitri Fontaine
2f7169e286 Fix MS SQL N'' default values.
MS SQL apparently sends default values as Nvarchar, and in this case it
means we have to deal ourselves with the N'' representation of it.
2017-02-25 16:14:26 +01:00
Jan Moringen
57bc1ca886 Shadow symbols NAMESTRING, NUMBER and INLINE in pgloader.parser package (#515)
Defining rules on standard symbols like CL:NAMESTRING is a bad idea
since other systems may do the same, inadvertently overwriting each
other's rules.

Furthermore, future esrap versions will probably prevent defining
rules whose names are symbols in locked packages, making this change
mandatory.
2017-02-12 15:14:34 +01:00
Dimitri Fontaine
024579c60d Fix SBCL version requirements.
As we now depend on a recent enough version of ASDF in some of our build
dependencies, that raises the bar to SBCL 1.2.5 or newer now.

Fixes #497.
2017-01-28 18:17:19 +01:00
Dimitri Fontaine
6bd17f45da Add support for MS SQL smalldatetime data type.
Availability of a test case for MS SQL allows us to make progress on this
limitation and add support for the smalldatetime data type. It is
converted server-side with the same CONVERT expression as the longer
datetime datatype.

Fixes #431.
2017-01-28 18:08:55 +01:00
Dimitri Fontaine
a799cd5f5f Improve error handling for MS SQL.
In particular, implement more solid handling of poorly encoded data or
badly setup connections, by reporting the error and continuing the load.
2017-01-28 17:47:44 +01:00
Dimitri Fontaine
ddda2f92ca Force column ordering in SQLite support.
In the case of targeting an already existing PostgreSQL database,
columns might have been reordered. Add the column name list to the COPY
command we send so that we figure the mapping out automatically.

Fixes #509.
2017-01-28 17:45:33 +01:00
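A short sketch of why an explicit column list helps, as in the commit above: with the list present, PostgreSQL maps incoming fields by name rather than by the target table's physical column order. The helper `copy_command` is hypothetical and only builds the statement text.

```python
def copy_command(table, columns):
    """Build a COPY command that names its columns explicitly."""
    cols = ", ".join('"' + c.replace('"', '""') + '"' for c in columns)
    return f"COPY {table} ({cols}) FROM STDIN"


# The source sends (name, id) even if the target table was created as (id, name).
print(copy_command("public.users", ["name", "id"]))
```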
Dimitri Fontaine
b54ca576cb Raise some log messages.
We should be able to follow the progress more easily at the log level
NOTICE, so raise some log messages from INFO to NOTICE.
2017-01-28 17:44:18 +01:00
Dimitri Fontaine
ed217b7b28 Add some docs about FreeTDS and encoding.
It turns out that it's possible and not too complex, when using the
FreeTDS driver, to enforce the client encoding for MS SQL to be utf-8.
Document how to tweak ~/.freetds.conf to that end.
2017-01-27 22:16:59 +01:00
Dimitri Fontaine
1d025bcd5a Fix log levels.
It looks like we missed the INFO level messages in the DEBUG output.
2017-01-23 21:52:38 +01:00
Dimitri Fontaine
bd84c6fec9 Fix default value handling in MS SQL.
When the column is known to be non-nullable, refrain from adding a null
default value to it. This also fixes the case of casting from an

  [int] IDENTITY(1,1) NOT NULL

That otherwise did get transformed into a

  bigserial not null default NULL

Causing then the error

  Database error 42601: multiple default values specified for column ... of table ...
2017-01-23 21:50:16 +01:00
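The rule in the commit above (no null default on a non-nullable column) can be sketched in Python as below. The function `column_definition` and its parameters are assumptions for illustration, not the cast code pgloader uses.

```python
def column_definition(name, pg_type, nullable, default=None):
    """Render a column definition, never adding 'default NULL' to NOT NULL columns."""
    parts = [name, pg_type]
    if not nullable:
        parts.append("not null")
    if default is not None:
        parts.append(f"default {default}")
    elif nullable:
        # Only a nullable column may carry an explicit null default.
        parts.append("default NULL")
    return " ".join(parts)


# An identity column cast to bigserial already has an implicit default.
print(column_definition("id", "bigserial", nullable=False))
```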
Dimitri Fontaine
c0f9569ddd In passing aesthetic concerns. 2017-01-23 21:49:42 +01:00
Dimitri Fontaine
1d35290914 Assorted fixes for MS SQL support.
When updating the catalog support we forgot to fix the references to the
index and fkey name slots that are now provided centrally for the
catalog of all database source types.

Again, we don't have unit test cases for MS SQL, so that's a blind
fix (but at least that compiles).

See #343.
2017-01-10 21:19:29 +01:00
Dimitri Fontaine
dbf7d6e48f Don't double-quote identifiers in catalog queries.
Avoid double quoting the schema names when used in PostgreSQL catalog
queries, where the identifiers are used as literal values and need to be
single-quoted.

Fix #476, again.
2017-01-10 21:12:34 +01:00
Dimitri Fontaine
8da09d7bed Log PostgreSQL Catalog queries at SQL log level.
See #476 where it would have been helpful to see the PostgreSQL catalog
queries with `--log-min-messages sql` in the bug report. Also more
generally useful.
2017-01-10 21:12:34 +01:00
Dimitri Fontaine
17536e84a4 Create CNAME 2017-01-06 21:37:59 +01:00
Dimitri Fontaine
effa916b31 Improve parallelism setup documentation.
The code comment displayed in the release notes for 3.3.1 is reported to
be better at explaining the concurrency control than what we had in the
main documentation, so add it there.

Fix #496.
2017-01-03 23:13:01 +01:00
Dimitri Fontaine
21a10235db Refrain from issuing the summary twice.
Now that we have a proper flush system for reporting the summary at the
proper time (see 7c5396f097), refrain from
also taking care of the reporting when stopping the monitor.

Adapt the regression driver code to flush the summary after loading the
expected data, which also provides better output.

When the summary output is sent to a file, that would also create a
backup file and replace our summary with an empty new file at monitor
stop...

Fixes #499.
2017-01-03 23:07:58 +01:00
Dimitri Fontaine
b239e6b556 Fix #498. 2017-01-03 22:28:58 +01:00
Dimitri Fontaine
381ba18b50 Add a new log level: SQL.
This sits between NOTICE and INFO, making it possible to have a complete log
of the SQL queries sent to the server while avoiding the very verbose
traffic of the DEBUG log level.

See #498.
2017-01-03 22:27:17 +01:00
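A minimal sketch of how a level placed between NOTICE and INFO filters messages. Only the four levels named in these commits are modelled, and the ordering from most verbose to least verbose (DEBUG, INFO, SQL, NOTICE) is an assumption drawn from the commit messages; `should_log` is a hypothetical helper.

```python
# Assumed ordering, most verbose first.
LEVELS = ["debug", "info", "sql", "notice"]


def should_log(message_level, min_level):
    """Return True when a message at message_level passes the min_level filter."""
    return LEVELS.index(message_level) >= LEVELS.index(min_level)


# With the minimum level set to "sql", queries and notices pass,
# while info and debug traffic is filtered out.
for level in LEVELS:
    print(level, should_log(level, "sql"))
```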
Dimitri Fontaine
4931604361 Allow ALTER SCHEMA command for MySQL.
This pgloader command allows migrating tables while changing the schema
they are found in between their MySQL source database and their
PostgreSQL target database.

This changes the default behavior of pgloader with MySQL from always
targeting the 'public' schema to targeting by default a schema named
the same as the MySQL database. You can revert to the old behavior by
adding a rule:

   ALTER SCHEMA 'dbname' RENAME TO 'public'

We might want to add a patch to re-install the default behavior later.

Also see #489 where it used not to be possible to rename the schema at
migration time, causing strange errors (you need to spot NIL as the
schema name in the "failed to find target table" messages).
2016-12-18 19:31:21 +01:00
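The new default described in the ALTER SCHEMA commit above, and the way a rename rule overrides it, can be summarised in a small sketch. The function `target_schema` and the `renames` mapping are hypothetical names used only for illustration.

```python
def target_schema(source_db, renames):
    """Default to a schema named after the source database unless renamed."""
    return renames.get(source_db, source_db)


# Default behavior: the schema is named after the MySQL database.
print(target_schema("dbname", {}))
# With a rule equivalent to: ALTER SCHEMA 'dbname' RENAME TO 'public'
print(target_schema("dbname", {"dbname": "public"}))
```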
Dimitri Fontaine
bdaacae3e7 Fix Primary Keys count.
That was broken in a recent patch refactoring the PostgreSQL SQL execute
API that now accepts a list of commands to execute.
2016-12-18 19:31:21 +01:00
Dimitri Fontaine
320a545533 Fix SQL types creation: consider views too.
When migrating views from e.g. MySQL it is necessary to consider the
user defined SQL types (ENUMs) those views might be using.
2016-12-18 19:31:21 +01:00
Dimitri Fontaine
ad56cf808b Fix PostgreSQL index naming.
A PostgreSQL index is always created in the same schema as the table it
is defined against, and the CREATE INDEX command doesn't accept schema
qualified index names.
2016-12-18 19:31:21 +01:00
Andy Freeland
9a0c50f700 Make sure EPEL is enabled when installing SBCL (#494) 2016-12-17 16:30:57 +01:00
Dimitri Fontaine
1c927beb81 Fix cl-postgres packaging (typo). 2016-12-12 12:04:40 +01:00
Dimitri Fontaine
37fc4ba550 Back to development mode. 2016-12-04 14:09:36 +01:00
Dimitri Fontaine
ac202dc70e Prepare release 3.3.2. 2016-12-03 17:38:52 +01:00
Dimitri Fontaine
db9fa2f001 Improve docs for connection strings.
Some parts of the connection strings might be provided from the
environment, such as in the MySQL case. Fix #485.
2016-12-03 15:51:39 +01:00
Dimitri Fontaine
6eef0c6c00 Improve docs with default parallelism settings.
Fix #442 by adding the default values of concurrency and workers.
2016-12-03 15:30:34 +01:00
Dimitri Fontaine
7c5396f097 Review fatal errors handling.
Make it so that fatal errors are printed only once, and when possible
included in the usual log format as handled by our monitoring thread.
Also, improve error and summary reporting when we load from several
sources on the same command line.

All this work has been triggered because of an edge case where the OS
return value of the pgloader command was 0 (zero, success) although the
given file on the command line does not exist.

Fixes #486.
2016-11-27 23:58:50 +01:00
Dimitri Fontaine
2dc733c4d6 Fix corner case in creating indexes again.
When the option "drop indexes" is in use in loading data from a file, we
collect the indexes from the PostgreSQL catalogs and then issue DROP
commands against them before the load, then CREATE commands when it's
done.

The CREATE is done in parallel, and we create an lparallel kernel for
that. The kernel must have a worker-count of at least 1, and we were
not considering the case of 0 indexes on the target table.

Fix #484.
2016-11-20 17:17:15 +01:00
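The corner case in the commit above has a close analogue in Python, where `concurrent.futures.ThreadPoolExecutor` also rejects a worker count of zero. The sketch below is an analogy, not the lparallel code; `create_indexes` and its `run` callback are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def create_indexes(index_sql, run):
    """Run one CREATE INDEX command per worker, tolerating an empty list."""
    if not index_sql:
        # A pool with zero workers cannot be created, so skip it entirely.
        return []
    with ThreadPoolExecutor(max_workers=len(index_sql)) as pool:
        return list(pool.map(run, index_sql))


# str.upper stands in for "send this command to the server".
print(create_indexes([], str.upper))
print(create_indexes(["create index a on t (a)"], str.upper))
```

Guarding the empty case before building the pool is one option; clamping the worker count to a minimum of 1 would be another.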