mirror of
https://github.com/dimitri/pgloader.git
synced 2025-08-08 15:27:00 +02:00
This allows to benefit from github pages without having to maintain a separate orphaned branch.
172 lines
9.2 KiB
HTML
172 lines
9.2 KiB
HTML
<!DOCTYPE html>
|
|
<html lang="en">
|
|
<head>
|
|
<meta charset="utf-8">
|
|
<meta http-equiv="X-UA-Compatible" content="IE=edge">
|
|
<meta name="viewport" content="width=device-width, initial-scale=1.0">
|
|
<meta name="description" content="">
|
|
<meta name="author" content="">
|
|
<link rel="shortcut icon" href="../../docs-assets/ico/favicon.png">
|
|
|
|
<title>pgloader</title>
|
|
|
|
<!-- Bootstrap core CSS -->
|
|
<link href="../dist/css/bootstrap.css" rel="stylesheet">
|
|
|
|
<!-- Custom styles for this template -->
|
|
<link href="../dist/carousel.css" rel="stylesheet">
|
|
</head>
|
|
<!-- NAVBAR
|
|
================================================== -->
|
|
<body>
|
|
<div class="navbar-wrapper">
|
|
<div class="container">
|
|
|
|
<div class="navbar navbar-inverse navbar-static-top" role="navigation">
|
|
<div class="container">
|
|
<div class="navbar-header">
|
|
<button type="button" class="navbar-toggle" data-toggle="collapse" data-target=".navbar-collapse">
|
|
<span class="sr-only">Toggle navigation</span>
|
|
<span class="icon-bar"></span>
|
|
<span class="icon-bar"></span>
|
|
<span class="icon-bar"></span>
|
|
</button>
|
|
<a class="navbar-brand" href="../index.html">pgloader</a>
|
|
</div>
|
|
<div class="navbar-collapse collapse">
|
|
<ul class="nav navbar-nav">
|
|
<li><a href="../index.html">Home</a></li>
|
|
<li><a href="quickstart.html">Quick Start</a></li>
|
|
<li><a href="pgloader.1.html">Reference documentation</a></li>
|
|
<li class="dropdown active">
|
|
<a href="#" class="dropdown-toggle" data-toggle="dropdown">Advanced HowTos <b class="caret"></b></a>
|
|
<ul class="dropdown-menu">
|
|
<li class="dropdown-header">Plain Files</li>
|
|
<li><a href="csv.html">CSV</a></li>
|
|
<li><a href="fixed.html">Fixed format</a></li>
|
|
<li><a href="geolite.html">Geolite</a></li>
|
|
<li class="divider"></li>
|
|
<li class="dropdown-header">Databases</li>
|
|
<li><a href="dBase.html">dBase</a></li>
|
|
<li><a href="sqlite.html">SQLite</a></li>
|
|
<li><a href="mysql.html">MySQL</a></li>
|
|
</ul>
|
|
</li>
|
|
<li><a href="../download.html">Download</a></li>
|
|
<li><a href="../sponsors.html">Sponsors</a></li>
|
|
<li><a href="../pgloader-moral-license.html">License</a></li>
|
|
</ul>
|
|
</div>
|
|
</div>
|
|
</div>
|
|
|
|
</div>
|
|
</div>
|
|
|
|
<!-- an empty carousel -->
|
|
<div id="myCarousel" class="carousel slide" data-ride="carousel" style="height: 100px">
|
|
<div class="carousel-inner" style="height: 100px">
|
|
<div class="item active" style="height: 100px">
|
|
<img data-src="holder.js/900x100/auto/#777:#7a7a7a" style="height: 100px">
|
|
<!-- <div class="container"> -->
|
|
<!-- <div class="carousel-caption"> -->
|
|
<!-- <h1>Load data into PostgreSQL. Fast.</h1> -->
|
|
<!-- <p></p> -->
|
|
<!-- </div> -->
|
|
<!-- </div> -->
|
|
</div>
|
|
</div>
|
|
</div><!-- /.carousel -->
|
|
|
|
<div class="container">
|
|
<div class="row">
|
|
<div class="col-md-2"> </div>
|
|
<div class="col-md-8">
|
|
<h1>Loading Fixed Width Data File with pgloader</h1><p>Some data providers still use a format where each column is specified with a starting index position and a given length. Usually the columns are blank-padded when the data is shorter than the full reserved range. </p><h2>The Command</h2><p>To load data with <a href="http://pgloader.io/">pgloader</a> you need to define in a <em>command</em> the operations in some details. Here's our example for loading Fixed Width Data, using a file provided by the US census. </p><p>You can find more files from them at the <a href="http://www.census.gov/geo/maps-data/data/gazetteer2000.html">Census 2000 Gazetteer Files</a>. </p><p>Here's our command: </p><pre><code>LOAD ARCHIVE
|
|
FROM http://www.census.gov/geo/maps-data/data/docs/gazetteer/places2k.zip
|
|
INTO postgresql:///pgloader
|
|
|
|
BEFORE LOAD DO
|
|
$$ drop table if exists places; $$,
|
|
$$ create table places
|
|
(
|
|
usps char(2) not null,
|
|
fips char(2) not null,
|
|
fips_code char(5),
|
|
loc_name varchar(64)
|
|
);
|
|
$$
|
|
|
|
LOAD FIXED
|
|
FROM FILENAME MATCHING ~/places2k.txt/
|
|
WITH ENCODING latin1
|
|
(
|
|
usps from 0 for 2,
|
|
fips from 2 for 2,
|
|
fips_code from 4 for 5,
|
|
"LocationName" from 9 for 64 [trim right whitespace],
|
|
p from 73 for 9,
|
|
h from 82 for 9,
|
|
land from 91 for 14,
|
|
water from 105 for 14,
|
|
ldm from 119 for 14,
|
|
wtm from 131 for 14,
|
|
lat from 143 for 10,
|
|
long from 153 for 11
|
|
)
|
|
INTO postgresql:///pgloader?places
|
|
(
|
|
usps, fips, fips_code, "LocationName"
|
|
); </code></pre><p>You can see the full list of options in the <a href="pgloader.1.html">pgloader reference manual</a>, with a complete description of the options you see here. </p><h2>The Data</h2><p>This command allows loading the following file content, where we are only showing the first couple of lines: </p><pre><code>AL0100124Abbeville city 2987 1353 40301945 120383 15.560669 0.046480 31.566367 -85.251300
|
|
AL0100460Adamsville city 4965 2042 50779330 14126 19.606010 0.005454 33.590411 -86.949166
|
|
AL0100484Addison town 723 339 9101325 0 3.514041 0.000000 34.200042 -87.177851
|
|
AL0100676Akron town 521 239 1436797 0 0.554750 0.000000 32.876425 -87.740978
|
|
AL0100820Alabaster city 22619 8594 53023800 141711 20.472605 0.054715 33.231162 -86.823829
|
|
AL0100988Albertville city 17247 7090 67212867 258738 25.951034 0.099899 34.265362 -86.211261
|
|
AL0101132Alexander City city 15008 6855 100534344 433413 38.816529 0.167342 32.933157 -85.936008 </code></pre><h2>Loading the data</h2><p>Let's start the <code>pgloader</code> command with our <code>census-places.load</code> command file: </p><pre><code>$ pgloader census-places.load
|
|
... LOG Starting pgloader, log system is ready.
|
|
... LOG Parsing commands from file "/Users/dim/dev/pgloader/test/census-places.load"
|
|
... LOG Fetching 'http://www.census.gov/geo/maps-data/data/docs/gazetteer/places2k.zip'
|
|
... LOG Extracting files from archive '//private/var/folders/w7/9n8v8pw54t1gngfff0lj16040000gn/T/pgloader//places2k.zip'
|
|
|
|
table name read imported errors time
|
|
----------------- --------- --------- --------- --------------
|
|
download 0 0 0 1.494s
|
|
extract 0 0 0 1.013s
|
|
before load 2 2 0 0.013s
|
|
----------------- --------- --------- --------- --------------
|
|
places 25375 25375 0 0.499s
|
|
----------------- --------- --------- --------- --------------
|
|
Total import time 25375 25375 0 3.019s </code></pre><p>We can see that <a href="pgloader">http://pgloader.io</a> did download the file from its HTTP URL location then <em>unziped</em> it before the loading itself. </p><p>Note that the output of the command has been edited to facilitate its browsing online. </p> </div>
|
|
<div class="col-md-2"> </div>
|
|
</div>
|
|
|
|
<!-- FOOTER -->
|
|
<footer>
|
|
<p class="pull-right"><a href="#">Back to top</a></p>
|
|
<p>© 2013-2014 Dimitri Fontaine. ·</p>
|
|
</footer>
|
|
|
|
</div><!-- /.container -->
|
|
|
|
|
|
<!-- Bootstrap core JavaScript
|
|
================================================== -->
|
|
<!-- Placed at the end of the document so the pages load faster -->
|
|
<script src="https://code.jquery.com/jquery-1.10.2.min.js"></script>
|
|
<script src="../dist/js/bootstrap.min.js"></script>
|
|
<!-- <script src="docs-assets/js/holder.js"></script> -->
|
|
|
|
<script>
|
|
(function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
|
|
(i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
|
|
m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
|
|
})(window,document,'script','//www.google-analytics.com/analytics.js','ga');
|
|
|
|
ga('create', 'UA-47059482-2', 'tapoueh.org');
|
|
ga('send', 'pageview');
|
|
|
|
</script>
|
|
</body>
|
|
</html>
|