mirror of
https://github.com/dimitri/pgloader.git
synced 2026-05-05 02:46:10 +02:00
Improve parallelism setup documentation.
The code comment displayed in the release notes for 3.3.1 is reported to be better at explaining the concurrency control than what we had in the main documentation, so add it there. Fix #496.
parent
21a10235db
commit
effa916b31
2
Makefile
@@ -230,4 +230,4 @@ latest:
 
 check: test ;
 
-.PHONY: test pgloader-standalone
+.PHONY: test pgloader-standalone docs
18
pgloader.1
@@ -1,7 +1,7 @@
 .\" generated with Ronn/v0.7.3
 .\" http://github.com/rtomayko/ronn/tree/0.7.3
 .
-.TH "PGLOADER" "1" "December 2016" "ff" ""
+.TH "PGLOADER" "1" "January 2017" "ff" ""
 .
 .SH "NAME"
 \fBpgloader\fR \- PostgreSQL data loader
@@ -487,7 +487,21 @@ At the moment, the number of transformer and writer tasks are forced into being
 The parameter \fIworkers\fR allows to control how many worker threads are allowed to be active at any time (that\'s the parallelism level); and the parameter \fIconcurrency\fR allows to control how many tasks are started to handle the data (they may not all run at the same time, depending on the \fIworkers\fR setting)\.
 .
 .P
-With a \fIconcurrency\fR of 2, we start 1 reader thread, 2 transformer threads and 2 writer tasks, that\'s 5 concurrent tasks to schedule into \fIworkers\fR threads\.
+We allow \fIworkers\fR simultaneous workers to be active at the same time in the context of a single table\. A single unit of work consist of several kinds of workers:
+.
+.IP "\(bu" 4
+a reader getting raw data from the source,
+.
+.IP "\(bu" 4
+N transformers preparing raw data for PostgreSQL COPY protocol,
+.
+.IP "\(bu" 4
+N writers sending the data down to PostgreSQL\.
+.
+.IP "" 0
+.
+.P
+The N here is setup to the \fIconcurrency\fR parameter: with a \fICONCURRENCY\fR of 2, we start (+ 1 2 2) = 5 concurrent tasks, with a \fIconcurrency\fR of 4 we start (+ 1 4 4) = 9 concurrent tasks, of which only \fIworkers\fR may be active simultaneously\.
 .
 .P
 So with \fBworkers = 4, concurrency = 2\fR, the parallel scheduler will maintain active only 4 of the 5 tasks that are started\.
@@ -433,9 +433,18 @@ parameter *concurrency* allows to control how many tasks are started to
 handle the data (they may not all run at the same time, depending on the
 *workers* setting).
 
-With a *concurrency* of 2, we start 1 reader thread, 2 transformer threads
-and 2 writer tasks, that's 5 concurrent tasks to schedule into *workers*
-threads.
+We allow *workers* simultaneous workers to be active at the same time in the
+context of a single table. A single unit of work consist of several kinds of
+workers:
+
+- a reader getting raw data from the source,
+- N transformers preparing raw data for PostgreSQL COPY protocol,
+- N writers sending the data down to PostgreSQL.
+
+The N here is setup to the *concurrency* parameter: with a *CONCURRENCY* of
+2, we start (+ 1 2 2) = 5 concurrent tasks, with a *concurrency* of 4 we
+start (+ 1 4 4) = 9 concurrent tasks, of which only *workers* may be active
+simultaneously.
 
 So with `workers = 4, concurrency = 2`, the parallel scheduler will
 maintain active only 4 of the 5 tasks that are started.
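
The task arithmetic described in this change can be checked with a small sketch. It is written in Python for illustration only and is not pgloader's Common Lisp code; the function names `started_tasks` and `max_active_tasks` are hypothetical, and the sketch assumes nothing beyond what the documentation states: one reader, N transformers and N writers per table, with at most *workers* of them active at once.

```python
def started_tasks(concurrency: int) -> int:
    """Tasks started for one table: 1 reader + N transformers + N writers."""
    if concurrency < 1:
        raise ValueError("concurrency must be at least 1")
    return 1 + concurrency + concurrency


def max_active_tasks(workers: int, concurrency: int) -> int:
    """Upper bound on simultaneously active tasks, capped by `workers`."""
    if workers < 1:
        raise ValueError("workers must be at least 1")
    return min(workers, started_tasks(concurrency))


if __name__ == "__main__":
    # Hypothetical settings, chosen to mirror the examples in the text.
    for workers, concurrency in [(4, 2), (4, 4)]:
        total = started_tasks(concurrency)
        active = max_active_tasks(workers, concurrency)
        print(f"workers={workers} concurrency={concurrency}: "
              f"{total} tasks started, at most {active} active")
```

With `workers = 4, concurrency = 2` this gives 5 started tasks and at most 4 active ones, matching the sentence above.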