Loading From an Archive
=======================

This command instructs pgloader to load data from one or more files contained
in an archive. Currently the only supported archive format is *ZIP*, and the
archive can be downloaded from an *HTTP* URL.

Here's an example::

    LOAD ARCHIVE
       FROM /Users/dim/Downloads/GeoLiteCity-latest.zip
       INTO postgresql:///ip4r

       BEFORE LOAD
         DO $$ create extension if not exists ip4r; $$,
            $$ create schema if not exists geolite; $$,

         EXECUTE 'geolite.sql'

       LOAD CSV
            FROM FILENAME MATCHING ~/GeoLiteCity-Location.csv/
                 WITH ENCODING iso-8859-1
                 (
                    locId,
                    country,
                    region     null if blanks,
                    city       null if blanks,
                    postalCode null if blanks,
                    latitude,
                    longitude,
                    metroCode  null if blanks,
                    areaCode   null if blanks
                 )
            INTO postgresql:///ip4r?geolite.location
                 (
                    locid,country,region,city,postalCode,
                    location point using (format nil "(~a,~a)" longitude latitude),
                    metroCode,areaCode
                 )
            WITH skip header = 2,
                 fields optionally enclosed by '"',
                 fields escaped by double-quote,
                 fields terminated by ','

      AND LOAD CSV
            FROM FILENAME MATCHING ~/GeoLiteCity-Blocks.csv/
                 WITH ENCODING iso-8859-1
                 (
                    startIpNum, endIpNum, locId
                 )
            INTO postgresql:///ip4r?geolite.blocks
                 (
                    iprange ip4r using (ip-range startIpNum endIpNum),
                    locId
                 )
            WITH skip header = 2,
                 fields optionally enclosed by '"',
                 fields escaped by double-quote,
                 fields terminated by ','

       FINALLY DO
         $$ create index blocks_ip4r_idx on geolite.blocks using gist(iprange); $$;

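The two `using` expressions in the example above are Common Lisp forms that
compute a target column from source fields. As a rough illustration only, the
Python sketch below mimics the text they produce. The `format nil "(~a,~a)"`
form joins its two arguments into a point literal. The behaviour shown for
`ip-range` is an assumption made for this sketch: it is taken to turn two
integer addresses into a dotted-quad `start-end` range. The function names
`point_text` and `ip_range_text` are hypothetical and are not part of pgloader.

```python
import ipaddress


def point_text(longitude: str, latitude: str) -> str:
    """Mimic (format nil "(~a,~a)" longitude latitude)."""
    return f"({longitude},{latitude})"


def ip_range_text(start_ip_num: str, end_ip_num: str) -> str:
    """Assumed behaviour of ip-range: integer strings in, dotted-quad range out."""
    start = ipaddress.IPv4Address(int(start_ip_num))
    end = ipaddress.IPv4Address(int(end_ip_num))
    return f"{start}-{end}"


if __name__ == "__main__":
    print(point_text("-122.4194", "37.7749"))
    print(ip_range_text("16777216", "16777471"))
```

Note that the point literal places longitude first, matching the argument
order used in the example command.
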
The `archive` command accepts the following clauses and options.

Archive Source Specification: FROM
----------------------------------

Filename or HTTP URI to load the data from. When given an HTTP URL, the
linked file is downloaded locally before processing.

If the file is a `zip` file, the command line utility `unzip` is used to
expand the archive into files in `$TMPDIR`, or in `/tmp` if `$TMPDIR` is unset
or names a directory that does not exist.

The commands that follow are then run from the top-level directory where the
archive has been expanded.

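The directory fallback described above can be sketched in Python. This is
not pgloader's implementation: the function names `extraction_root` and
`expand_archive` are hypothetical, Python's `zipfile` module stands in for
the `unzip` utility, and expanding into a fresh subdirectory is a choice made
for this sketch only.

```python
import os
import tempfile
import zipfile
from typing import Mapping


def extraction_root(env: Mapping[str, str]) -> str:
    """Return $TMPDIR when it names an existing directory, else /tmp."""
    tmpdir = env.get("TMPDIR")
    if tmpdir and os.path.isdir(tmpdir):
        return tmpdir
    return "/tmp"


def expand_archive(archive_path: str, env: Mapping[str, str] = os.environ) -> str:
    """Expand a ZIP archive under the extraction root and return the directory.

    zipfile is used here as a stand-in for the `unzip` command line utility.
    """
    target = tempfile.mkdtemp(dir=extraction_root(env))
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(target)
    return target
```

Passing the environment as a mapping keeps the fallback rule easy to check
without changing the real process environment.
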
Archive Sub Commands
--------------------

- command [ *AND* command ... ]

  A series of commands against the contents of the archive. At the moment
  only `CSV`, `FIXED` and `DBF` commands are supported.

  Note that these commands support the clause *FROM FILENAME MATCHING*,
  which allows the pgloader command not to depend on the exact names of
  the archive directories.

  The same clause can also be applied to several files by using the
  spelling *FROM ALL FILENAMES MATCHING* and a regular expression.

  The whole *matching* clause must follow this rule::

    FROM [ ALL FILENAMES | [ FIRST ] FILENAME ] MATCHING

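The difference between the two spellings of the *matching* clause can be
illustrated with a short Python sketch. It is not pgloader's code: the
function names are hypothetical, Python's `re` module stands in for
pgloader's own regular expression engine, and both the search anywhere in
the path and the list order used to pick the "first" file are assumptions
of this sketch.

```python
import re
from typing import List, Optional


def first_filename_matching(paths: List[str], pattern: str) -> Optional[str]:
    """Sketch of FROM [ FIRST ] FILENAME MATCHING: the first matching path."""
    regex = re.compile(pattern)
    for path in paths:
        if regex.search(path):
            return path
    return None


def all_filenames_matching(paths: List[str], pattern: str) -> List[str]:
    """Sketch of FROM ALL FILENAMES MATCHING: every matching path."""
    regex = re.compile(pattern)
    return [path for path in paths if regex.search(path)]


if __name__ == "__main__":
    # Hypothetical contents of an expanded archive.
    expanded = [
        "GeoLiteCity_20130903/GeoLiteCity-Blocks.csv",
        "GeoLiteCity_20130903/GeoLiteCity-Location.csv",
    ]
    print(first_filename_matching(expanded, r"GeoLiteCity-Location.csv"))
    print(all_filenames_matching(expanded, r"GeoLiteCity-.*\.csv"))
```

Because the pattern is searched for anywhere in the path, the dated directory
name in the hypothetical listing does not need to appear in the command.
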
Archive Final SQL Commands
--------------------------

- *FINALLY DO*

  SQL queries to run once the data is loaded, such as `CREATE INDEX`.