pgloader/test/citus
Dimitri Fontaine 3f2f10eef1 Finish implementation of CAST rules for PostgreSQL source databases.
Add a link to the table from the internal catalogs for columns so that we
can match table-source-name in cast rules when migrating from PostgreSQL.
2018-11-19 19:33:37 +01:00
..
.gitignore Add a Citus distribution test case, from the citus tutorial. 2018-10-18 15:42:17 +02:00
company.load Finish implementation of CAST rules for PostgreSQL source databases. 2018-11-19 19:33:37 +01:00
company.sql Add a Citus distribution test case, from the citus tutorial. 2018-10-18 15:42:17 +02:00
copy.sql Add a Citus distribution test case, from the citus tutorial. 2018-10-18 15:42:17 +02:00
data.load Add a Citus distribution test case, from the citus tutorial. 2018-10-18 15:42:17 +02:00
Makefile Add a Citus distribution test case, from the citus tutorial. 2018-10-18 15:42:17 +02:00
README.md Add a Citus distribution test case, from the citus tutorial. 2018-10-18 15:42:17 +02:00

Citus Multi-Tenant Automatic Distribution

In this test case we follow the following documentation:

https://docs.citusdata.com/en/v7.5/use_cases/multi_tenant.html

We install the schema before Citus migration, and load the data without the backfilling that is already done. For that we use pgloader to ignore the company_id column in the tables that didn't have this column prior to the Citus migration effort.

Then the following company.load file contains the pgloader command that runs a full migration from PostgreSQL to Citus:

load database
   from pgsql:///hackathon
   into pgsql://localhost:9700/dim

   with include drop, reset no sequences

   distribute companies using id;

Tables are marked distributed, the company_id column is added where it's needed, primary keys and foreign keys definitions are altered to the new model, and finally the data is backfilled automatically in the target table thanks to generating queries like the following:

SELECT "campaigns".company_id::text,
       "impressions".id::text,
       "impressions".ad_id::text,
       "impressions".seen_at::text,
       "impressions".site_url::text,
       "impressions".cost_per_impression_usd::text,
       "impressions".user_ip::text,
       "impressions".user_data::text
  FROM "public"."impressions"  
        JOIN "public"."ads" ON impressions.ad_id = ads.id
        JOIN "public"."campaigns" ON ads.campaign_id = campaigns.id