From 8038931f5a05521508d4c0ffbc46f376cc8651d3 Mon Sep 17 00:00:00 2001 From: Jon Dufresne Date: Thu, 2 Apr 2015 09:15:06 -0700 Subject: [PATCH] Remove trailing whitespace. --- pgloader.1.md | 136 +++++++++++++++++++++++++------------------------- 1 file changed, 68 insertions(+), 68 deletions(-) diff --git a/pgloader.1.md b/pgloader.1.md index b21446b..6e3a648 100644 --- a/pgloader.1.md +++ b/pgloader.1.md @@ -16,7 +16,7 @@ the data into the server, and manages errors by filling a pair of pgloader operates either using commands which are read from files: pgloader commands.load - + or by using arguments and options all provided on the command line: pgloader SOURCE TARGET @@ -32,7 +32,7 @@ The source connection string format is as follows: format:///absolute/path/to/file.ext format://./relative/path/to/file.ext - + Where format might be one of `csv`, `fixed`, `copy`, `dbf`, `db3` or `ixf`. db://user:pass@host:port/dbname @@ -108,7 +108,7 @@ Those options are meant to tweak `pgloader` behavior when loading data. * `-S`, `--summary`: A filename where to copy the summary output. When relative, the filename is expanded into `*root-dir*`. - + The format of the filename defaults to being *human readable*. It is possible to have the output in machine friendly formats such as *CSV*, *COPY* (PostgreSQL's own COPY format) or *JSON* by specifying a filename @@ -136,47 +136,47 @@ parser. In simple cases, it can be much easier to use the *SOURCE* and options: * `--with "option"`: - + Allows setting options from the command line. You can use that option as many times as you want. The option arguments must follow the *WITH* clause for the source type of the `SOURCE` specification, as described later in this document. - + * `--set "guc_name='value'"` - + Allows setting PostgreSQL configuration from the command line. Note that the option parsing is the same as when used from the *SET* command clause, in particular you must enclose the guc value with single-quotes. 
- + * `--field "..."` - + Allows setting a source field definition. Fields are accumulated in the order given on the command line. It's possible to either use a `--field` option per field in the source file, or to separate field definitions by a comma, as you would do in the *HAVING FIELDS* clause. - + * `--cast "..."` - + Allows setting a specific casting rule for loading the data. * `--type csv|fixed|db3|ixf|sqlite|mysql|mssql` - + Allows forcing the source type, when the *SOURCE* parsing isn't satisfactory. - + * `--encoding <encoding>` - + Set the encoding of the source file to load data from. - + * `--before <filename>` - + Parse the given filename for SQL queries and run them against the target database before loading the data from the source. The queries are parsed by pgloader itself: they need to be terminated by a semi-colon (;) and the file may include `\i` or `\ir` commands to *include* another file. - + * `--after <filename>` - + Parse the given filename for SQL queries and run them against the target database after having loaded the data from the source. The queries are parsed in the same way as with the `--before` option, see above.
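Putting the options above together, a complete load can be driven from the command line without a command file. The following invocation is only an illustrative sketch (the file names, field names, and target URI are invented here), combining `--type`, `--with`, `--set`, `--field` and `--before` as described above:

    pgloader --type csv                              \
             --with "skip header = 1"                \
             --with "fields terminated by ','"       \
             --set "work_mem='32MB'"                 \
             --field "id" --field "name"             \
             --before ./create-table.sql             \
             ./data/people.csv                       \
             postgresql:///example?tablename=people

Each `--with` argument must be a valid option for the chosen source type, as covered in the per-format sections later in this document.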
@@ -201,7 +201,7 @@ Review the command line options and pgloader's version: Use the command file as the pgloader command argument, pgloader will parse that file and execute the commands found in it: - pgloader --verbose ./test/csv-districts.load + pgloader --verbose ./test/csv-districts.load ### CSV @@ -395,18 +395,18 @@ pgloader supports the following input formats: - csv, which also includes tsv and other common variants where you can change the *separator* and the *quoting* rules and how to *escape* the *quotes* themselves; - + - fixed columns file, where pgloader is flexible enough to accommodate source files missing columns (*ragged fixed length column files* do exist); - + - PostgreSQL COPY formatted files, following the COPY TEXT documentation of PostgreSQL, such as the reject files prepared by pgloader; - + - dBase files known as db3 or dbf files; - + - ixf formatted files, ixf being a binary storage format from IBM; - + - sqlite databases with fully automated discovery of the schema and advanced cast rules; @@ -436,7 +436,7 @@ options. [ WITH ] [ SET ] - + [ BEFORE LOAD DO|EXECUTE [ | ] [ AFTER LOAD DO|EXECUTE [ | ] ; @@ -450,13 +450,13 @@ specific clauses such as the `CAST` clause. Some clauses are common to all commands: - *FROM* - + The *FROM* clause specifies where to read the data from, and each command introduces its own variant of sources. For instance, the *CSV* source supports `inline`, `stdin`, a filename, a quoted filename, and a *FILENAME MATCHING* clause (see above); whereas the *MySQL* source only supports a MySQL database URI specification. - + In all cases, the *FROM* clause is able to read its value from an environment variable when using the form `GETENV 'varname'`. @@ -465,7 +465,7 @@ Some clauses are common to all commands: The PostgreSQL connection URI must contain the name of the target table to load the data into. That table must have already been created in PostgreSQL, and the name might be schema qualified.
- + The *INTO* target database connection URI can be parsed from the value of an environment variable when using the form `GETENV 'varname'`. @@ -480,21 +480,21 @@ Some clauses are common to all commands: dynamically with the `--load` command line parameter. Each *USING* expression is compiled at runtime to native code. - + This feature allows pgloader to load any number of fields in a CSV file into a possibly different number of columns in the database, using custom code for that projection. - *WITH* - + Set of options to apply to the command, using a global syntax of either: - + - *key = value* - *use option* - *do not use option* - + See each specific command for details. - + - *SET* This clause allows to specify session parameters to be set for all the @@ -548,7 +548,7 @@ Where: Can contain any character, including colon (`:`) which must then be doubled (`::`) and at-sign (`@`) which must then be doubled (`@@`). - + When omitted, the *user* name defaults to the value of the `PGUSER` environment variable, and if it is unset, the value of the `USER` environment variable. @@ -591,17 +591,17 @@ Where: determined above. - *options* - + The optional parameters must be supplied with the form `name=value`, and you may use several parameters by separating them away using an ampersand (`&`) character. - + Only two options are supported here, *tablename* (which might be qualified with a schema name) and *sslmode*. - + The *sslmode* parameter values can be one of `disable`, `allow`, `prefer` or `require`. - + For backward compatibility reasons, it's possible to specify the *tablename* option directly, without spelling out the `tablename=` parts. @@ -660,31 +660,31 @@ data source kind, such as the CSV *skip header* option. The global batch behaviour options are: - *batch rows* - + Takes a numeric value as argument, used as the maximum number of rows allowed in a batch. 
The default is `25 000` and can be changed to try having better performance characteristics or to control pgloader memory usage; - + - *batch size* - + Takes a memory unit as argument, such as *20 MB*, its default value. Accepted multipliers are *kB*, *MB*, *GB*, *TB* and *PB*. The case is important so as not to confuse bits with bytes: we're only talking bytes here. - + - *batch concurrency* - + Takes a numeric value as argument, defaults to `10`. That's the number of batches that pgloader is allowed to build in memory, even when only a single batch at a time might be sent to PostgreSQL. - + Supporting more than a single batch being sent at a time is on the TODO list of pgloader, but is not implemented yet. This option is about controlling the memory needs of pgloader as a trade-off against performance characteristics, and not about parallel activity of pgloader. - + Other options are specific to each input source; please refer to the relevant parts of the documentation for their listing and coverage. @@ -751,7 +751,7 @@ The `csv` format command accepts the following clauses and options: for exact syntax, several options can be used here) to filenames. It's then possible to load data from only the first match of all of them. - + The optional *IN DIRECTORY* clause allows specifying which directory to walk for finding the data files, and can be either relative to where the command file is read from, or absolute. The given @@ -779,7 +779,7 @@ The `csv` format command accepts the following clauses and options: Date format strings are template strings modeled on the PostgreSQL `to_char` template string support, limited to the following patterns: - + - YYYY, YYY, YY for the year part - MM for the numeric month part - DD for the numeric day part @@ -791,9 +791,9 @@ The `csv` format command accepts the following clauses and options: - MS for the milliseconds part (4 digits) - US for the microseconds part (6 digits) - unparsed punctuation signs: - .
* # @ T / \ and space - + Here's an example of a *date format* specification: - + column-name [date format 'YYYY-MM-DD HH24-MI-SS.US'] - *null if* @@ -994,7 +994,7 @@ The `fixed` format command accepts the following clauses and options: Date format strings are template strings modeled on the PostgreSQL `to_char` template string support, limited to the following patterns: - + - YYYY, YYY, YY for the year part - MM for the numeric month part - DD for the numeric day part @@ -1006,9 +1006,9 @@ The `fixed` format command accepts the following clauses and options: - MS for the milliseconds part (4 digits) - US for the microseconds part (6 digits) - unparsed punctuation signs: - . * # @ T / \ and space - + Here's an example of a *date format* specification: - + column-name [date format 'YYYY-MM-DD HH24-MI-SS.US'] - *null if* @@ -1068,13 +1068,13 @@ data as described in the PostgreSQL documentation. Here's an example: milliseconds, bytes, unitprice ) INTO postgresql:///pgloader?track_full - + WITH truncate - + SET client_encoding to 'latin1', work_mem to '14MB', standard_conforming_strings to 'on' - + BEFORE LOAD DO $$ drop table if exists track_full; $$, $$ create table track_full ( @@ -1100,9 +1100,9 @@ The `COPY` format command accepts the following clauses and options: - *WITH* - + When loading from a `COPY` file, the following options are supported: - + - *truncate* When this option is listed, pgloader issues a `TRUNCATE` command @@ -1185,7 +1185,7 @@ an example: FROM data/nsitra.test1.ixf INTO postgresql:///pgloader?nsitra.test1 WITH truncate, create table - + BEFORE LOAD DO $$ create schema if not exists nsitra; $$, $$ drop table if exists nsitra.test1; $$; @@ -1467,7 +1467,7 @@ The `database` command accepts the following clauses and options: In some cases, such as when the DDL is entirely left to a framework, it might be sensible for pgloader to refrain from handling index unique names; that is achieved by using the *preserve index names* option.
- + The default is to *uniquify index names*. - *foreign keys* @@ -1650,7 +1650,7 @@ The `database` command accepts the following clauses and options: be dropped at the end of the data loading. - *MATERIALIZE ALL VIEWS* - + Same behaviour as *MATERIALIZE VIEWS* using the dynamic list of views as returned by MySQL rather than asking the user to specify the list. @@ -1905,9 +1905,9 @@ The `sqlite` command accepts the following clauses and options: When this option is listed, pgloader only issues the `COPY` statements, without doing any other processing. - + - *encoding* - + This option controls which encoding the SQLite text data is parsed with. Defaults to UTF-8. @@ -1982,7 +1982,7 @@ Here's an example: into postgresql:///dbname including only table names like 'GlobalAccount' in schema 'dbo' - + set work_mem to '16MB', maintenance_work_mem to '512 MB' before load do $$ drop schema if exists dbo cascade; $$; @@ -2094,7 +2094,7 @@ The provided transformation functions are: In: "20041002152952" Out: "2004-10-02 15:29:52" - + - *time-with-no-separator* Transform the given time into a format that PostgreSQL will actually In: "08231560" Out: "08:23:15.60" - + - *tinyint-to-boolean* As MySQL lacks a proper boolean type, *tinyint* is often used to @@ -2167,19 +2167,19 @@ The provided transformation functions are: http://www.postgresql.org/docs/9.3/interactive/datatype-binary.html - *sqlite-timestamp-to-timestamp* - + The SQLite type system is quite interesting, so we cope with it here to produce timestamp literals as expected by PostgreSQL. That covers 4-digit year-only values, zero dates (converted to null), and proper date strings. - *sql-server-uniqueidentifier-to-uuid* - + The SQL Server driver receives data of type uniqueidentifier as a byte vector that we then need to convert to a UUID string for the PostgreSQL COPY input format to process.
- + - *unix-timestamp-to-timestamptz* - + Converts a unix timestamp (number of seconds elapsed since the beginning of 1970) into a proper PostgreSQL timestamp format.
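To sketch how such a transformation function is applied in practice, the hypothetical command below (database names, table, and column are invented for illustration) uses a *CAST* rule with `using` to run `unix-timestamp-to-timestamptz` on a MySQL column that stores epoch seconds:

    load database
         from mysql://user@localhost/app
         into postgresql:///app

         cast column events.created_at
              to timestamptz using unix-timestamp-to-timestamptz;

The `using <function>` part names one of the transformation functions above; the same mechanism applies to the other functions in this list.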