pgloader/docs/howto/csv.html
Dimitri Fontaine 9e574ce884 Rename web/ into docs/
This allows to benefit from github pages without having to maintain a
separate orphaned branch.
2016-08-19 20:55:29 +02:00

164 lines
7.5 KiB
HTML

<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<meta name="description" content="">
<meta name="author" content="">
<link rel="shortcut icon" href="../../docs-assets/ico/favicon.png">
<title>pgloader</title>
<!-- Bootstrap core CSS -->
<link href="../dist/css/bootstrap.css" rel="stylesheet">
<!-- Custom styles for this template -->
<link href="../dist/carousel.css" rel="stylesheet">
</head>
<!-- NAVBAR
================================================== -->
<body>
<div class="navbar-wrapper">
<div class="container">
<div class="navbar navbar-inverse navbar-static-top" role="navigation">
<div class="container">
<div class="navbar-header">
<button type="button" class="navbar-toggle" data-toggle="collapse" data-target=".navbar-collapse">
<span class="sr-only">Toggle navigation</span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
</button>
<a class="navbar-brand" href="../index.html">pgloader</a>
</div>
<div class="navbar-collapse collapse">
<ul class="nav navbar-nav">
<li><a href="../index.html">Home</a></li>
<li><a href="quickstart.html">Quick Start</a></li>
<li><a href="pgloader.1.html">Reference documentation</a></li>
<li class="dropdown active">
<a href="#" class="dropdown-toggle" data-toggle="dropdown">Advanced HowTos <b class="caret"></b></a>
<ul class="dropdown-menu">
<li class="dropdown-header">Plain Files</li>
<li><a href="csv.html">CSV</a></li>
<li><a href="fixed.html">Fixed format</a></li>
<li><a href="geolite.html">Geolite</a></li>
<li class="divider"></li>
<li class="dropdown-header">Databases</li>
<li><a href="dBase.html">dBase</a></li>
<li><a href="sqlite.html">SQLite</a></li>
<li><a href="mysql.html">MySQL</a></li>
</ul>
</li>
<li><a href="../download.html">Download</a></li>
<li><a href="../sponsors.html">Sponsors</a></li>
<li><a href="../pgloader-moral-license.html">License</a></li>
</ul>
</div>
</div>
</div>
</div>
</div>
<!-- an empty carousel -->
<div id="myCarousel" class="carousel slide" data-ride="carousel" style="height: 100px">
<div class="carousel-inner" style="height: 100px">
<div class="item active" style="height: 100px">
<img data-src="holder.js/900x100/auto/#777:#7a7a7a" style="height: 100px">
<!-- <div class="container"> -->
<!-- <div class="carousel-caption"> -->
<!-- <h1>Load data into PostgreSQL. Fast.</h1> -->
<!-- <p></p> -->
<!-- </div> -->
<!-- </div> -->
</div>
</div>
</div><!-- /.carousel -->
<div class="container">
<div class="row">
<div class="col-md-2"> </div>
<div class="col-md-8">
<h1>Loading CSV Data with pgloader</h1><p>CSV means <em>comma separated values</em> and is often found with quite varying specifications. pgloader allows you to describe those specs in its command. </p><h2>The Command</h2><p>To load data with <a href="http://pgloader.io/">pgloader</a> you need to define in a <em>command</em> the operations in some details. Here's our example for loading CSV data: </p><pre><code> LOAD CSV
FROM 'path/to/file.csv' (x, y, a, b, c, d)
INTO postgresql:///pgloader?csv (a, b, d, c)
WITH truncate,
skip header = 1,
fields optionally enclosed by '"',
fields escaped by double-quote,
fields terminated by ','
SET client_encoding to 'latin1',
work_mem to '12MB',
standard_conforming_strings to 'on'
BEFORE LOAD DO
$$ drop table if exists csv; $$,
$$ create table csv (
a bigint,
b bigint,
c char(2),
d text
);
$$; </code></pre><p>You can see the full list of options in the <a href="pgloader.1.html">pgloader reference manual</a>, with a complete description of the options you see here. </p><h2>The Data</h2><p>This command allows loading the following CSV file content: </p><pre><code>Header, with a © sign
"2.6.190.56","2.6.190.63","33996344","33996351","GB","United Kingdom"
"3.0.0.0","4.17.135.31","50331648","68257567","US","United States"
"4.17.135.32","4.17.135.63","68257568","68257599","CA","Canada"
"4.17.135.64","4.17.142.255","68257600","68259583","US","United States"
"4.17.143.0","4.17.143.15","68259584","68259599","CA","Canada"
"4.17.143.16","4.18.32.71","68259600","68296775","US","United States" </code></pre><h2>Loading the data</h2><p>Here's how to start loading the data. Note that the ouput here has been edited so as to facilitate its browsing online. </p><pre><code>$ pgloader csv.load
... LOG Starting pgloader, log system is ready.
... LOG Parsing commands from file "/Users/dim/dev/pgloader/test/csv.load"
table name read imported errors time
----------------- --------- --------- --------- --------------
before load 2 2 0 0.039s
----------------- --------- --------- --------- --------------
csv 6 6 0 0.019s
----------------- --------- --------- --------- --------------
Total import time 6 6 0 0.058s </code></pre><h2>The result</h2><p>As you can see, the command described above is filtering the input and only importing some of the columns from the example data file. Here's what gets loaded in the PostgreSQL database: </p><pre><code>pgloader# table csv;
a | b | c | d
----------+----------+----+----------------
33996344 | 33996351 | GB | United Kingdom
50331648 | 68257567 | US | United States
68257568 | 68257599 | CA | Canada
68257600 | 68259583 | US | United States
68259584 | 68259599 | CA | Canada
68259600 | 68296775 | US | United States
(6 rows) </code></pre> </div>
<div class="col-md-2"> </div>
</div>
<!-- FOOTER -->
<footer>
<p class="pull-right"><a href="#">Back to top</a></p>
<p>&copy; 2013-2014 Dimitri Fontaine. &middot;</p>
</footer>
</div><!-- /.container -->
<!-- Bootstrap core JavaScript
================================================== -->
<!-- Placed at the end of the document so the pages load faster -->
<script src="https://code.jquery.com/jquery-1.10.2.min.js"></script>
<script src="../dist/js/bootstrap.min.js"></script>
<!-- <script src="docs-assets/js/holder.js"></script> -->
<script>
(function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
(i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
})(window,document,'script','//www.google-analytics.com/analytics.js','ga');
ga('create', 'UA-47059482-2', 'tapoueh.org');
ga('send', 'pageview');
</script>
</body>
</html>