Loading From an Archive
=======================

This command instructs pgloader to load data from one or more files contained
in an archive. Currently the only supported archive format is *ZIP*, and the
archive might be downloaded from an *HTTP* URL.

Here's an example::

    LOAD ARCHIVE
       FROM /Users/dim/Downloads/GeoLiteCity-latest.zip
       INTO postgresql:///ip4r

       BEFORE LOAD
         DO $$ create extension if not exists ip4r; $$,
            $$ create schema if not exists geolite; $$,
         EXECUTE 'geolite.sql'

       LOAD CSV
            FROM FILENAME MATCHING ~/GeoLiteCity-Location.csv/
                 WITH ENCODING iso-8859-1
                 (
                    locId,
                    country,
                    region     null if blanks,
                    city       null if blanks,
                    postalCode null if blanks,
                    latitude,
                    longitude,
                    metroCode  null if blanks,
                    areaCode   null if blanks
                 )
            INTO postgresql:///ip4r?geolite.location
                 (
                    locid,country,region,city,postalCode,
                    location point using (format nil "(~a,~a)" longitude latitude),
                    metroCode,areaCode
                 )
            WITH skip header = 2,
                 fields optionally enclosed by '"',
                 fields escaped by double-quote,
                 fields terminated by ','

       AND LOAD CSV
            FROM FILENAME MATCHING ~/GeoLiteCity-Blocks.csv/
                 WITH ENCODING iso-8859-1
                 (
                    startIpNum, endIpNum, locId
                 )
            INTO postgresql:///ip4r?geolite.blocks
                 (
                    iprange ip4r using (ip-range startIpNum endIpNum),
                    locId
                 )
            WITH skip header = 2,
                 fields optionally enclosed by '"',
                 fields escaped by double-quote,
                 fields terminated by ','

       FINALLY DO
         $$ create index blocks_ip4r_idx on geolite.blocks using gist(iprange); $$;

The `archive` command accepts the following clauses and options.

Archive Source Specification: FROM
----------------------------------

Filename or HTTP URI where to load the data from. When given an HTTP URL, the
linked file will get downloaded locally before processing.

If the file is a `zip` file, the command line utility `unzip` is used to
expand the archive into files in `$TMPDIR`, or `/tmp` if `$TMPDIR` is unset
or set to a non-existing directory.

The sub-commands that follow are then run from the top level directory where
the archive has been expanded.
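
For instance, here's a minimal sketch of a *FROM* clause pointing at a remote
copy of the archive; the URL is hypothetical, and the rest is lifted from the
example at the top of this page::

    LOAD ARCHIVE
         FROM http://example.com/data/GeoLiteCity-latest.zip
         INTO postgresql:///ip4r

         LOAD CSV
              FROM FILENAME MATCHING ~/GeoLiteCity-Blocks.csv/
                   WITH ENCODING iso-8859-1
                   (
                      startIpNum, endIpNum, locId
                   )
              INTO postgresql:///ip4r?geolite.blocks
                   (
                      iprange ip4r using (ip-range startIpNum endIpNum),
                      locId
                   )
              WITH skip header = 2,
                   fields terminated by ','
    ;
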

Archive Sub Commands
--------------------

  - command [ *AND* command ... ]

    A series of commands to run against the contents of the archive; at the
    moment only `CSV`, `FIXED` and `DBF` commands are supported.

    Note that those commands support the clause *FROM FILENAME MATCHING*,
    which allows the pgloader command not to depend on the exact names of
    the archive directories.

    The same clause can also be applied to several files at once by using
    the spelling *FROM ALL FILENAMES MATCHING* together with a regular
    expression, as in the sketch below.

    The whole *matching* clause must follow this rule::

        FROM [ ALL FILENAMES | [ FIRST ] FILENAME ] MATCHING
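
    For instance, here's a sketch of loading every blocks file found in the
    archive with a single sub-command; it reuses the blocks example from the
    top of this page, and only the broadened file name pattern is an
    assumption::

        LOAD CSV
             FROM ALL FILENAMES MATCHING ~/GeoLiteCity-Blocks.*[.]csv/
                  WITH ENCODING iso-8859-1
                  (
                     startIpNum, endIpNum, locId
                  )
             INTO postgresql:///ip4r?geolite.blocks
                  (
                     iprange ip4r using (ip-range startIpNum endIpNum),
                     locId
                  )
             WITH skip header = 2,
                  fields optionally enclosed by '"',
                  fields escaped by double-quote,
                  fields terminated by ','
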

Archive Final SQL Commands
--------------------------

  - *FINALLY DO*

    SQL Queries to run once the data is loaded, such as `CREATE INDEX`; see
    the sketch below.
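
    For instance, a short sketch of a *FINALLY* block that creates an index
    and then refreshes the planner statistics; the `analyze` step is an
    added assumption, the index is taken from the example above::

        FINALLY DO
          $$ create index blocks_ip4r_idx on geolite.blocks using gist(iprange); $$,
          $$ analyze geolite.blocks; $$;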