syncstorage-rs

mirror of https://github.com/mozilla-services/syncstorage-rs.git synced 2025-08-10 05:46:56 +02:00

Author	SHA1	Message	Date
JR Conlin	b0f1590f4a	feat: Allow for failure "replay" from failure file (#644) New option: `--retry_file=` takes a previous failure file and will retry the bso/UIDs contained in it. Closes #642	2020-06-03 10:30:46 -07:00
JR Conlin	fa96964f07	bug: Make `bso_num` in migrate_node less truthy (#637) Closes #636	2020-05-14 16:05:54 -07:00
JR Conlin	8aaa4492e9	User migration5 (#601) * bug: Fix typos in tick, string replacements * f multi-thread gen_bso_users * added `--start_bso`, `--end_bso` to `gen_bso_users.py` * added `bso_num` arg (same as `--start_bso=# --end_bso=#`) to `migrate_node.py` * `gen_bso_users.py` takes same `bso_users_file` template as `migrate_node.py` * f remove default value for BSO_Users.run bso_num * f fix lock issue in gen_bso_users, trap for `` states in gen_fxa_users * f make threading optional. There's a locking issue that appears to be inside of mysql. Turning threading off for now (can be run in parallel) * f fix tick, threading flag * f rename confusing args in gen_bso and gen_fxa gen_bso_users: `--bso_users_file` => `--output_file` gen_fxa_users: `--fxa_file` => `--users_file` `--fxa_users_file` => `--output_file` * f more tick fixes * f don't use threading on Report if threading isn't available. * f make `--bso_users_file` / `--fxa_users_file` consistent * `--bso_user_file` is now `--bso_users_file` Issue #407	2020-04-29 13:09:07 -07:00
Philip Jenvey	16058f20a4	feat: add a --wipe_user mode deletes pre-existing user data on spanner before migrating. only usable in --user mode. and fix parsing of the new gen_fxa_users.py output Closes #596	2020-04-20 11:41:36 -07:00
jrconlin	c4ffdb636a	f break apart migrate_node into submodules Yeah, this one's full of stuff. * `gen_fxa_users.py` takes the tokendata file, and generates a file containing the converted uid => fxa_uid/fxa_kid values. See `gen_fxa_users.py --help` for arguments. * `gen_bso_users.py` takes the generated `fxa_users_{date}.lst` file from `gen_fxa_users.py`, pulls the users from the `--bso_num` and dumps them to `bso_users_{bso_num}_{date}.lst` * `{success,failure}_` files are now only generated when needed. In addition, they are now suffixed with `.log`. Hopefully a bit easier to find and clean up. `migrate_node.py` now takes `--bso_users_file` which is either the name of a file that will be used for all BSOs, or a template that will be used to find the bso_users_file (e.g. if you specify `--bso_users_file=users/bso_users_#_2020_04_14.lst` and `--bso_start=1 --bso_end=3`, migrate user will pull from `users/bso_users_1_2020_04_14.lst` for users in BSO#1, `users/bso_users_2_2020_04_14.lst` for users in BSO#2, etc.) NOTE: by default scripts will date stamp various cached files. Ideally, we should take reasonably "fresh" ones to avoid potentially missing users that are suddenly added to nodes. This is not a requirement, and all scripts allow for a custom file name.	2020-04-14 16:12:56 -07:00
jrconlin	caddb661ed	f fix comment	2020-04-14 16:12:56 -07:00
jrconlin	18f7c22ae3	f address pjenvey's todo added user_collection.last_modified to bso data pull	2020-04-10 14:59:35 -07:00
jrconlin	0cb62b98ec	f pip8	2020-04-10 14:44:48 -07:00
jrconlin	a74ed7b6d2	f fetch count with users, kick hoarders early	2020-04-10 14:41:41 -07:00
Philip Jenvey	d6b2dc2187	fix: don't replace user_collections since bsos INTERLEAVE's w/ DELETE CASCADE - persist unique_key_filter across writes - fix new bundling of bso_values for inserting bsos - add TODO for fixing user_collections' modified time	2020-04-09 18:25:37 -07:00
jrconlin	1adfb6449e	f break user percentage into its own function	2020-04-09 14:07:27 -07:00
jrconlin	edd0017d2c	feat: latest ops requests * Add --hoard_limit to limit max number of records per user * add reason to `failure_*.csv`	2020-04-09 12:05:18 -07:00
jrconlin	b74e529231	f fix "helpful" argparse help string parsing. TLDR: Don't use a single %	2020-04-08 16:59:09 -07:00
jrconlin	f3d358caee	f general cleanup	2020-04-07 11:34:28 -07:00
jrconlin	84e1efbd27	f fix `--user` argument	2020-04-07 10:50:26 -07:00
jrconlin	4f9cb14b78	f add PID to `success_.csv` and `failure_.csv` files.	2020-04-07 08:43:33 -07:00
jrconlin	f9c1e5a532	f fix uid references, warning logic	2020-04-07 08:26:07 -07:00
jrconlin	d4a4ff885c	f convert k_c_a & generation to ints	2020-04-07 07:52:00 -07:00
jrconlin	2a6d5e28b4	f correct error reporting	2020-04-07 07:41:18 -07:00
jrconlin	00e67b4baf	f trap for "NULL" as client state	2020-04-06 19:28:28 -07:00
jrconlin	29185f28c3	f Dockerfile fix #4 add success / fail uid files.	2020-04-06 17:09:33 -07:00
jrconlin	99e152b5d8	f flake8 fixes	2020-04-06 16:02:28 -07:00
jrconlin	edca5ef0a5	f alter default anonymization * check for "NULL" client_state in user.csv and skip if need be.	2020-04-06 15:57:50 -07:00
jrconlin	3df4c34d87	f r's	2020-04-02 17:36:14 -07:00
jrconlin	55edc74ad7	f add `--ms_delay` flag. use `ms_delay` to pause between spanner transaction `--readchunk`s. This allows some primitive throttling for feeding spanner data. Reminder: `readchunk` sets the max number of items to try to write per chunk to spanner in any given transaction, default value 1000.	2020-04-02 09:38:54 -07:00
jrconlin	08a646a36e	feat: add `--user_percent` option The `--user_percent` option will divvy up the users into blocks and move the specified block. It takes an option formatted as "block#:percentage". Block numbers are 1 based. For example, --user_percent=2:33 will divide the total distinct users into non-overlapping blocks of approximately 33%, and then move the second block (e.g. the 33-65th users in the list). Extra users that may not be evenly divided into percentage blocks will be appended to the last block. (e.g. for `--user_percent=3:33`, users 66-99 would be copied over, a total of 34 users) Issue #407	2020-04-02 09:38:54 -07:00
jrconlin	0a9cf9c650	f pjenvey fix	2020-03-18 15:02:33 -07:00
jrconlin	be3b18f879	f add rust_migration WIP * make user sorting optional * formatting tweaks to dump_mysql.py and sync.avsc	2020-03-17 16:53:07 -07:00
jrconlin	a65123bcf2	feat: Add `--abort` and `--user_range` flags * --abort stops copying BSO records after N instances. * --user_range limits copy to offset:limit users. * sorts users by fxa_uid	2020-03-16 13:32:38 -07:00
JR Conlin	ecfca9fdf5	feat: more user_migration stuff (#450) * feat: more user_migration stuff * create script to move users by node directly * moved old scripts to `old` directory (for historic reasons, as well as possible future use) * cleaned up README * try to solve the `parent row` error an intermittent error may be caused by one of two things: 1) a transaction failure resulted in a premature add of the unique key to the UC filter. 2) an internal spanner update error resulting from trying to write the bso before the user_collection row was written. * Added "fix_collections.sql" script to update collections table to add well known collections for future rectification. * returned collection name lookup * add "--user" arg to set bso and user id * add `--dryrun` mode	2020-03-02 20:26:07 -08:00
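
The `--bso_users_file` template described in commit c4ffdb636a could be expanded per BSO number roughly as follows. This is only a sketch and not the code in `migrate_node.py`: the function name `expand_bso_users_files` is hypothetical, and treating the end of the BSO range as inclusive is an assumption that the commit message does not confirm.

```python
def expand_bso_users_files(template, bso_start, bso_end):
    """Map each BSO number to the users file it would read.

    Hypothetical helper. A template containing "#" has the BSO number
    substituted in; a plain file name is reused for every BSO.
    ASSUMPTION: bso_end is inclusive.
    """
    if bso_start > bso_end:
        raise ValueError("bso_start must not exceed bso_end")
    return {
        bso_num: template.replace("#", str(bso_num))
        for bso_num in range(bso_start, bso_end + 1)
    }


if __name__ == "__main__":
    files = expand_bso_users_files("users/bso_users_#_2020_04_14.lst", 1, 3)
    for bso_num, path in files.items():
        print(bso_num, path)
```

A file name without `#` maps every BSO to the same file, which matches the "used for all BSOs" case in the commit message.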

30 Commits
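
The block arithmetic behind `--user_percent` (commit 08a646a36e) can be sketched as below. This is an illustration under stated assumptions, not the repository's implementation: the names `parse_user_percent` and `user_percent_block` are hypothetical, positions are treated as 0-based offsets into the user list, and the block size is assumed to be rounded down.

```python
def parse_user_percent(value):
    """Split a "block#:percentage" string such as "2:33" into two ints."""
    block, percent = value.split(":")
    return int(block), int(percent)


def user_percent_block(total_users, block, percent):
    """Return the half-open range [start, end) of users in a block.

    Hypothetical helper. Block numbers are 1-based, as in the commit
    message. ASSUMPTION: block size is floor(total * percent / 100),
    and the last block also takes any leftover users.
    """
    if not 0 < percent <= 100:
        raise ValueError("percent must be in 1..100")
    num_blocks = 100 // percent
    if not 1 <= block <= num_blocks:
        raise ValueError("block must be in 1..%d" % num_blocks)
    block_size = max(1, total_users * percent // 100)
    start = min((block - 1) * block_size, total_users)
    if block == num_blocks:
        end = total_users  # leftovers go to the last block
    else:
        end = min(start + block_size, total_users)
    return start, end


if __name__ == "__main__":
    block, percent = parse_user_percent("3:33")
    print(user_percent_block(100, block, percent))
```

With 100 users, `2:33` selects offsets 33 through 65 and `3:33` selects offsets 66 through 99 (34 users), which agrees with the examples in the commit message.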