Currently, planning instructs to create all records even
those which does not match any zone.
Later, those records will be checked towards the existing
records and filtered whether they match or not a hosted zone.
This causes a problem, at least in the specific case of the Route53
implementation as it always calls the ApplyChanges method, which in its
turn always retrieves all records in all zones.
This causes high pressure on Route53 APIs, for non-necessary actions.
By being able to filter all unmanaged records from the plan, we can
prevent from calling ApplyChanges when nothing has to be done and hence
prevent an unnecessary listing of records.
By doing so, the rate of API calls to AWS Route53 is expected to be
reduced by 2
From measurements, AWS by default has pagination of 100 items per
page when listing hosted zone resources.
This increases the number of requests required to list all our zones,
and pushes a hard constraint on the rate limits.
From the experiments, it seems that on the server-side, there is a hard
limit of 300 elements per page, as per AWS documentation:
https://docs.aws.amazon.com/Route53/latest/APIReference/API_ListResourceRecordSets.html
> ListResourceRecordSets returns up to 300 resource record sets at a time in ASCII order,
> beginning at a position specified by the name and type elements
Hence raising the page size from 100 to 300 items would decrease by 3
the number of requests posted to Route53
We even set a higher limit so we can benefit from a lower number of
requests if ever AWS increases the hard limit of 300.
* adds tests for shouldUpdateProviderSpecific
Signed-off-by: Raffaele Di Fazio <difazio.raffaele@gmail.com>
* move AWS health to where it belongs
Signed-off-by: Raffaele Di Fazio <difazio.raffaele@gmail.com>
* add test that breaks things
Signed-off-by: Raffaele Di Fazio <difazio.raffaele@gmail.com>
* adds adjustendpoints method
Signed-off-by: Raffaele Di Fazio <raffo@github.com>
* fix controller
Signed-off-by: Raffaele Di Fazio <raffo@github.com>
* actually pass the provider where needed
Signed-off-by: Raffaele Di Fazio <raffo@github.com>
* OMG goland do your go fmt thing
Signed-off-by: Raffaele Di Fazio <raffo@github.com>
* use registry as proxy
Signed-off-by: Raffaele Di Fazio <raffo@github.com>
* make linter happy
Signed-off-by: Raffaele Di Fazio <raffo@github.com>
* change AdjustEndpoints signature
Signed-off-by: Raffaele Di Fazio <raffo@github.com>
* fix typo
Signed-off-by: Raffaele Di Fazio <raffo@github.com>
* actually use adjusted endpoints
Signed-off-by: Raffaele Di Fazio <raffo@github.com>
* revert cloudflare change
Signed-off-by: Raffaele Di Fazio <raffo@github.com>
* Update provider/cloudflare/cloudflare.go
Co-authored-by: Nick Jüttner <nick@juni.io>
Co-authored-by: Nick Jüttner <nick@juni.io>
Currently the AWS provider cannot handle record type changes. It always
attempts to UPSERT such updates, which will fail the entire zone batch
of changes. As a result, a single resource change can break all the
updates for the entire zone.
This change modifies the AWS behavior to correctly identify when the
record type changes and perform a batched DELETE and CREATE to update
the record successfully.
Special logic is required to handle ALIAS records which are not directly
encoded by the generic external-dns code, and relies on
convention (using a CNAME record type internally). I'm not sure this is
ideal as it's fairly error prone, and would prefer to see direct support
for such ALIAS types, but I've left it alone in this change.
When it syncs AWS DNS with k8s cluster content (at `--interval`), external-dns submits two distinct Route53 API calls:
* to fetch available zones (eg. for tag based zones discovery, or when zones are created after exernal-dns started),
* to fetch relevant zones' resource records.
Each call taxes the Route53 APIs calls budget (5 API calls per second per AWS account/region hard limit), increasing the probability of being throttled.
Changing synchronization interval would mitigate those calls' impact, but at the cost of keeping stale records for a longer time.
For most practical uses cases, zones list aren't expected to change frequently.
Even less so when external-dns is provided an explicit, static zones set (`--zone-id-filter` rather than `--aws-zone-tags`).
Using a zones list cache halves the number of Route53 read API calls.
When faced with errors from cloud providers (like "Throttling: Rate exceeded"), it's not always easy to find what operation caused the failure, and what action was aborted, if any,
Let's make it easier to identify an error source (and affected object when possible) by providing more context (and by using easy to find error messages).