Merge pull request #201 from jdufresne/fix-typo

Fix typo
This commit is contained in:
Dimitri Fontaine 2015-04-02 20:37:54 +02:00
commit 77394bd029


@ -16,7 +16,7 @@ the data into the server, and manages errors by filling a pair of
pgloader operates either using commands which are read from files:
pgloader commands.load
or by using arguments and options all provided on the command line:
pgloader SOURCE TARGET
@ -32,7 +32,7 @@ The source connection string format is as follows:
format:///absolute/path/to/file.ext
format://./relative/path/to/file.ext
Where format might be one of `csv`, `fixed`, `copy`, `dbf`, `db3` or `ixf`.
db://user:pass@host:port/dbname
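For instance, a local CSV file and a MySQL database (paths and credentials invented for the example) would be written:

    csv:///path/to/file.csv
    mysql://user:pass@localhost:3306/dbname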
@ -108,7 +108,7 @@ Those options are meant to tweak `pgloader` behavior when loading data.
* `-S`, `--summary`:
A filename where to copy the summary output. When relative, the filename
is expanded into `*root-dir*`.
The format of the filename defaults to being *human readable*. It is
possible to have the output in machine friendly formats such as *CSV*,
*COPY* (PostgreSQL's own COPY format) or *JSON* by specifying a filename
@ -136,47 +136,47 @@ parser. In simple cases, it can be much easier to use the *SOURCE* and
options:
* `--with "option"`:
Allows setting options from the command line. You can use that option as
many times as you want. The option arguments must follow the *WITH*
clause for the source type of the `SOURCE` specification, as described
later in this document.
* `--set "guc_name='value'"`:
Allows setting PostgreSQL configuration from the command line. Note that
the option parsing is the same as when used from the *SET* command
clause; in particular, you must enclose the guc value in single quotes.
* `--field "..."`:
Allows setting a source field definition. Fields are accumulated in the
order given on the command line. It's possible to either use a `--field`
option per field in the source file, or to separate field definitions by
a comma, as you would do in the *HAVING FIELDS* clause.
* `--cast "..."`:
Allows setting a specific casting rule for loading the data.
* `--type csv|fixed|db3|ixf|sqlite|mysql|mssql`:
Allows forcing the source type, in cases where the *SOURCE* parsing
isn't satisfactory.
* `--encoding <encoding>`:
Set the encoding of the source file to load data from.
* `--before <filename>`:
Parse given filename for SQL queries and run them against the target
database before loading the data from the source. The queries are parsed
by pgloader itself: they need to be terminated by a semi-colon (;) and
the file may include `\i` or `\ir` commands to *include* another file.
* `--after <filename>`:
Parse given filename for SQL queries and run them against the target
database after having loaded the data from the source. The queries are
parsed in the same way as with the `--before` option, see above.
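Several of these options can be combined in a single invocation; here is a sketch, with file, field, and table names invented for the example:

    pgloader --type csv                           \
             --field id --field name              \
             --with truncate                      \
             --with "fields terminated by ','"    \
             --set "work_mem='12MB'"              \
             ./data/items.csv                     \
             postgresql:///pgloader?tablename=items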
@ -201,7 +201,7 @@ Review the command line options and pgloader's version:
Use the command file as the pgloader command argument; pgloader will parse
that file and execute the commands found in it:
pgloader --verbose ./test/csv-districts.load
### CSV
@ -395,18 +395,18 @@ pgloader supports the following input formats:
- csv, which also includes tsv and other common variants where you can
change the *separator* and the *quoting* rules and how to *escape* the
*quotes* themselves;
- fixed columns file, where pgloader is flexible enough to accommodate
source files missing columns (*ragged fixed length column files* do
exist);
- PostgreSQL COPY formatted files, following the COPY TEXT documentation
of PostgreSQL, such as the reject files prepared by pgloader;
- dBase files, known as db3 or dbf files;
- ixf formatted files, ixf being a binary storage format from IBM;
- sqlite databases with fully automated discovery of the schema and
advanced cast rules;
@ -436,7 +436,7 @@ options.
[ WITH <load-options> ]
[ SET <postgresql-settings> ]
[ BEFORE LOAD DO|EXECUTE [ <sql statements> | <sql file> ]
[ AFTER LOAD DO|EXECUTE [ <sql statements> | <sql file> ]
;
@ -450,13 +450,13 @@ specific clauses such as the `CAST` clause.
Some clauses are common to all commands:
- *FROM*
The *FROM* clause specifies where to read the data from, and each
command introduces its own variant of sources. For instance, the *CSV*
source supports `inline`, `stdin`, a filename, a quoted filename, and a
*FILENAME MATCHING* clause (see above); whereas the *MySQL* source only
supports a MySQL database URI specification.
In all cases, the *FROM* clause is able to read its value from an
environment variable when using the form `GETENV 'varname'`.
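As a sketch, assuming an environment variable named `CSVPATH` (invented here) holds the source location, the clause would read:

    FROM GETENV 'CSVPATH'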
@ -465,7 +465,7 @@ Some clauses are common to all commands:
The PostgreSQL connection URI must contain the name of the target table
where to load the data into. That table must have already been created
in PostgreSQL, and the name might be schema qualified.
The *INTO* target database connection URI can be parsed from the value
of an environment variable when using the form `GETENV 'varname'`.
@ -480,21 +480,21 @@ Some clauses are common to all commands:
dynamically with the `--load` command line parameter.
Each *USING* expression is compiled at runtime to native code.
This feature allows pgloader to load any number of fields in a CSV file
into a possibly different number of columns in the database, using
custom code for that projection.
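A hypothetical projection, with field and column names borrowed from the shape of typical pgloader examples, could combine two source fields into a single target column:

    INTO postgresql:///pgloader?tablename=blocks
         (
            iprange ip4r using (ip-range startIpNum endIpNum)
         )

Here the two source fields `startIpNum` and `endIpNum` feed the single `iprange` column through the `ip-range` transformation.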
- *WITH*
Set of options to apply to the command, using a global syntax of either:
- *key = value*
- *use option*
- *do not use option*
See each specific command for details.
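The forms can be mixed within one clause; for instance, with CSV options sketched from the clauses documented below:

    WITH truncate,
         skip header = 1,
         fields terminated by ','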
- *SET*
This clause allows specifying session parameters to be set for all the
@ -548,14 +548,14 @@ Where:
Can contain any character, including colon (`:`) which must then be
doubled (`::`) and at-sign (`@`) which must then be doubled (`@@`).
When omitted, the *user* name defaults to the value of the `PGUSER`
environment variable, and if it is unset, the value of the `USER`
environment variable.
- *password*
Can contain any character, including the at sign (`@`) which must then
be doubled (`@@`). To leave the password empty, when the *user* name
ends with an at sign, you then have to use the syntax `user:@`.
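For example, a made-up password `pg@sswd` would be spelled with its at sign doubled:

    postgresql://user:pg@@sswd@localhost:5432/dbname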
@ -591,17 +591,17 @@ Where:
determined above.
- *options*
The optional parameters must be supplied with the form `name=value`, and
you may use several parameters by separating them with an
ampersand (`&`) character.
Only two options are supported here, *tablename* (which might be
qualified with a schema name) and *sslmode*.
The *sslmode* parameter values can be one of `disable`, `allow`,
`prefer` or `require`.
For backward compatibility reasons, it's possible to specify the
*tablename* option directly, without spelling out the `tablename=`
parts.
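Both options combined in a single target URI might look like this (host, database, and table names invented):

    postgresql://user@host/dbname?tablename=myschema.mytable&sslmode=prefer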
@ -660,31 +660,31 @@ data source kind, such as the CSV *skip header* option.
The global batch behaviour options are:
- *batch rows*
Takes a numeric value as argument, used as the maximum number of rows
allowed in a batch. The default is `25 000` and can be changed to
obtain better performance characteristics or to control pgloader
memory usage;
- *batch size*
Takes a memory unit as argument, such as *20 MB*, its default value.
Accepted multipliers are *kB*, *MB*, *GB*, *TB* and *PB*. The case is
important so as not to be confused about bits versus bytes: we're
only talking bytes here.
- *batch concurrency*
Takes a numeric value as argument, defaults to `10`. That's the number
of batches that pgloader is allowed to build in memory, even when only a
single batch at a time might be sent to PostgreSQL.
Supporting more than a single batch being sent at a time is on the TODO
list of pgloader, but is not implemented yet. This option is about
controlling the memory needs of pgloader as a trade-off against its
performance characteristics, and not about parallel activity of
pgloader.
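A *WITH* clause tuning all three batch settings at once, as a sketch based on the option descriptions above:

    WITH batch rows = 10000,
         batch size = 40 MB,
         batch concurrency = 5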
Other options are specific to each input source, please refer to specific
parts of the documentation for their listing and coverage.
@ -751,7 +751,7 @@ The `csv` format command accepts the following clauses and options:
for exact syntax, several options can be used here) to filenames.
It's then possible to load data from only the first match of all of
them.
The optional *IN DIRECTORY* clause allows specifying which directory
to walk for finding the data files, and can be either relative to
where the command file is read from, or absolute. The given
@ -779,7 +779,7 @@ The `csv` format command accepts the following clauses and options:
Date format strings are template strings modeled after the
PostgreSQL `to_char` template strings support, limited to the
following patterns:
- YYYY, YYY, YY for the year part
- MM for the numeric month part
- DD for the numeric day part
@ -791,9 +791,9 @@ The `csv` format command accepts the following clauses and options:
- MS for the milliseconds part (4 digits)
- US for the microseconds part (6 digits)
- unparsed punctuation signs: - . * # @ T / \ and space
Here's an example of a *date format* specification:
column-name [date format 'YYYY-MM-DD HH24-MI-SS.US']
- *null if*
@ -994,7 +994,7 @@ The `fixed` format command accepts the following clauses and options:
Date format strings are template strings modeled after the
PostgreSQL `to_char` template strings support, limited to the
following patterns:
- YYYY, YYY, YY for the year part
- MM for the numeric month part
- DD for the numeric day part
@ -1006,9 +1006,9 @@ The `fixed` format command accepts the following clauses and options:
- MS for the milliseconds part (4 digits)
- US for the microseconds part (6 digits)
- unparsed punctuation signs: - . * # @ T / \ and space
Here's an example of a *date format* specification:
column-name [date format 'YYYY-MM-DD HH24-MI-SS.US']
- *null if*
@ -1068,13 +1068,13 @@ data as described in the PostgreSQL documentation. Here's an example:
milliseconds, bytes, unitprice
)
INTO postgresql:///pgloader?track_full
WITH truncate
SET client_encoding to 'latin1',
work_mem to '14MB',
standard_conforming_strings to 'on'
BEFORE LOAD DO
$$ drop table if exists track_full; $$,
$$ create table track_full (
@ -1100,9 +1100,9 @@ The `COPY` format command accepts the following clauses and options:
- *WITH*
When loading from a `COPY` file, the following options are supported:
- *truncate*
When this option is listed, pgloader issues a `TRUNCATE` command
@ -1185,7 +1185,7 @@ an example:
FROM data/nsitra.test1.ixf
INTO postgresql:///pgloader?nsitra.test1
WITH truncate, create table
BEFORE LOAD DO
$$ create schema if not exists nsitra; $$,
$$ drop table if exists nsitra.test1; $$;
@ -1467,7 +1467,7 @@ The `database` command accepts the following clauses and options:
In some cases, such as when the DDL is entirely left to a framework, it
might be sensible for pgloader to refrain from making index names
unique; that is achieved by using the *preserve index names* option.
The default is to *uniquify index names*.
- *foreign keys*
@ -1650,7 +1650,7 @@ The `database` command accepts the following clauses and options:
be dropped at the end of the data loading.
- *MATERIALIZE ALL VIEWS*
Same behaviour as *MATERIALIZE VIEWS* using the dynamic list of views as
returned by MySQL rather than asking the user to specify the list.
@ -1905,9 +1905,9 @@ The `sqlite` command accepts the following clauses and options:
When this option is listed pgloader only issues the `COPY`
statements, without doing any other processing.
- *encoding*
This option controls which encoding to parse the SQLite
text data with. Defaults to UTF-8.
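A minimal sqlite load command, with invented database names, might read:

    load database
         from sqlite:///path/to/source.db
         into postgresql:///newdb
         with include drop, create tables, create indexes, reset sequences;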
@ -1982,7 +1982,7 @@ Here's an example:
into postgresql:///dbname
including only table names like 'GlobalAccount' in schema 'dbo'
set work_mem to '16MB', maintenance_work_mem to '512 MB'
before load do $$ drop schema if exists dbo cascade; $$;
@ -2094,7 +2094,7 @@ The provided transformation functions are:
In: "20041002152952"
Out: "2004-10-02 15:29:52"
- *time-with-no-separator*
Transform the given time into a format that PostgreSQL will actually
@ -2102,7 +2102,7 @@ The provided transformation functions are:
In: "08231560"
Out: "08:23:15.60"
- *tinyint-to-boolean*
As MySQL lacks a proper boolean type, *tinyint* is often used to
@ -2167,19 +2167,19 @@ The provided transformation functions are:
http://www.postgresql.org/docs/9.3/interactive/datatype-binary.html
- *sqlite-timestamp-to-timestamp*
The SQLite type system is quite flexible, so this function copes with
it to produce timestamp literals as expected by PostgreSQL. That covers
years on 4 digits only, converting zero dates to null, and proper date
strings.
- *sql-server-uniqueidentifier-to-uuid*
The SQL Server driver receives data of type uniqueidentifier as a byte
vector that then needs to be converted to a UUID string for the
PostgreSQL COPY input format to process.
- *unix-timestamp-to-timestamptz*
Converts a unix timestamp (number of seconds elapsed since the
1970) into a proper PostgreSQL timestamp format.
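These transformation functions are typically wired in through a *CAST* rule; a sketch with an invented table and column name:

    CAST column impressions.seen_at
         to timestamp with time zone
         using unix-timestamp-to-timestamptz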