...
- Go to ServiceNow (ideally not the PROD instance) and log in as an admin user.
- Navigate to the application filter (top left corner) and type sys_user.list to dump the user table.
- This will likely find 200k or so entries, and you cannot export that many accounts at once.
- Define a filter that shows you a subset of users, for example: "NetID starts with b".
- Apply that filter; in my case it narrowed the list to 7715 hits.
- Right-click along the column headers on the results page.
- You will get a new menu with an Export submenu.
- Pick CSV from that list.
- Choose to download the export.
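If you need to repeat the export often, ServiceNow list views can also be pulled as CSV over HTTP by appending CSV to the list URL. A minimal sketch; the instance name (myinstance-test) is hypothetical, and the sysparm_query encodes the same "NetID starts with b" filter:

```shell
# Hypothetical instance name and filter; adjust both to taste
INSTANCE="myinstance-test"
QUERY="user_nameSTARTSWITHb"
URL="https://${INSTANCE}.service-now.com/sys_user_list.do?CSV&sysparm_query=${QUERY}"
echo "$URL"

# Then fetch it with your admin credentials, e.g.:
# curl -u admin:password -o sys_user_from_test_starts_with_b.csv "$URL"
```

The same record-count limits apply as in the UI, so the per-letter filtering above is still needed.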
How to rebuild sys_user locally
I had a collection of .csv files, each with fewer than 50k entries, but it's nicer to work with the full set when performing queries.
Code Block |
---|
zuse:Downloads db692$ cat users_starts_numeric.csv users_starts_with_ba-c.csv users_starts_with_d-i.csv users_starts_with_j-l.csv users_starts_with_m-r.csv users_starts_with_s-z.csv > all_users.csv
# catch the header lines, and delete all but the first one
zuse:Downloads db692$ grep -n user_name all_users.csv
zuse:Downloads db692$ sed "175824d;127985d;81587d;41395d;125d" all_users.csv > all_users_one_header.csv
zuse:Downloads db692$ cut -d, -f1 all_users_one_header.csv > all_users_just_usernames.csv
|
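The sed step above deletes the repeated header lines by hard-coded line number, which breaks as soon as any export changes size. A more robust sketch of the same merge-and-dedup-headers step; the two-line sample files here are stand-ins for the real per-letter exports:

```shell
# Tiny sample exports standing in for the real per-letter files
printf '"user_name","name"\n"ba2","Beatrice Abetti"\n' > part1.csv
printf '"user_name","name"\n"bb3","Barbara Amato"\n' > part2.csv

# Keep line 1 (the first header), drop every later header line,
# no matter where it lands in the concatenated file
cat part1.csv part2.csv | awk 'NR==1 || $0 !~ /^"user_name"/' > all_users_one_header.csv

cut -d, -f1 all_users_one_header.csv > all_users_just_usernames.csv
```

This way the grep -n inspection step and the brittle line-number bookkeeping both go away.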
How to parse the exported CSV with a subset of the user table...
Code Block |
---|
zuse:Downloads db692$ head sys_user_from_test_starts_with_b.csv
"user_name","employee_number","name","title","source","u_facit","u_organization.u_department.u_division.u_id","u_organization.u_department.id","u_organization.u_id","active","mobile_phone","sys_created_on"
"b1912a","10793883","Brian Marshall","","","false","","","","false","","2012-02-23 11:43:23"
"ba2","10089420","Beatrice Abetti","","","false","D00609","G00806","872108","false","","2012-02-23 09:23:43"
"ba22","10097002","Barbara Amato-Kuslan","","","false","D03474","G01951","960003","true","","2012-02-23 09:24:22"
|
You want to look for the subset with colliding netids...
Code Block |
---|
zuse:Downloads db692$ cat sys_user_from_test_starts_with_b.csv | awk -F, '{print $1}' | uniq -d | wc
34 34 271
|
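One caveat about the pipeline above: uniq -d only reports duplicates that sit on adjacent lines, so it silently relies on the export already being sorted by user_name. Sorting explicitly first is safer; a small sketch with inline sample data where the duplicates are not adjacent:

```shell
# Sample export fragment: the "ba2" duplicates are not adjacent
printf '"user_name"\n"ba2"\n"ba22"\n"ba2"\n' > sample.csv

# Skip the header, take the first column, sort, then report duplicates
awk -F, 'NR>1 {print $1}' sample.csv | sort | uniq -d
```

Without the sort, uniq -d on this input would report nothing and the collision count would be too low.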
My concerns about this approach
...
Code Block |
---|
mysql> UPDATE uniq_between_both_sets, legit_netids SET source="AUTH" WHERE uniq_between_both_sets.netid = legit_netids.netid;
|
Status
As of the end of June 1, 2012, Backeberg has isolated just under 1000 duplicate netid entries. It's fairly straightforward to wipe these out. I've built a script to do the wipeout; Bill has asked that I make a no-op version that collects a log, and run that in pre-prod. That run started June 12, 2012. On June 15, 2012 I checked on the job; it was still running. Bill helped me find a bug in my transform (I didn't coalesce on common rows; he said he makes the same mistake occasionally). I tried stopping the job, but it wouldn't stop. Based on the burn rate, it should finish on its own Monday or Tuesday, by which time I'll probably be out on leave.
I've also approached the problem of finding the bad account entries in Service Now that have not YET collided with a legitimate account, but will in the future as new NetIDs are created. Said differently, these are the difference between the overall set of users in Service Now and the set of users in Central Auth. I've been using a collection of MySQL and command-line tricks to tease out these entries, and I think my fifth-round results are accurate and that my analysis approach will produce a result in a reasonable amount of time. Then we will probably put those results through the same scrutiny as my original set of duplicates. This completed in less time than I expected, and got interesting results...
Code Block |
---|
mysql> select DATE(netids_from_service_now.creation),count(1) FROM netids_from_service_now, uniq_between_both_sets where netids_from_service_now.netid = uniq_between_both_sets.netid GROUP BY DATE(netids_from_service_now.creation) WITH ROLLUP;
+----------------------------------------+----------+
| DATE(netids_from_service_now.creation) | count(1) |
+----------------------------------------+----------+
| 2004-05-01 | 1 |
| 2012-02-19 | 1 |
| 2012-02-22 | 15 |
| 2012-02-23 | 2266 |
| 2012-02-28 | 1 |
| 2012-02-29 | 12 |
| 2012-03-02 | 1 |
| 2012-03-07 | 2 |
| 2012-03-19 | 1 |
| 2012-03-22 | 3 |
| 2012-03-29 | 4 |
| 2012-03-30 | 1 |
| 2012-04-16 | 1 |
| 2012-04-18 | 22 |
| 2012-04-21 | 16 |
| 2012-04-22 | 3514 |
| 2012-04-24 | 1 |
| 2012-04-25 | 1 |
| 2012-04-26 | 15 |
| 2012-04-27 | 3 |
| 2012-05-01 | 1 |
| 2012-05-02 | 2 |
| 2012-05-03 | 2 |
| 2012-05-04 | 5 |
| 2012-05-07 | 1 |
| 2012-05-09 | 1 |
| 2012-05-14 | 1 |
| 2012-05-16 | 1 |
| 2012-05-17 | 1 |
| 2012-05-18 | 3 |
| 2012-05-22 | 14 |
| 2012-05-23 | 1 |
| 2012-05-24 | 2 |
| 2012-05-25 | 4 |
| 2012-05-28 | 4 |
| 2012-05-31 | 29 |
| NULL | 5953 |
+----------------------------------------+----------+
37 rows in set (1 min 24.52 sec)
|
The two big days, in February and April, I already knew about from intermediate spot checks. What surprised me is how broad a collection of hits I have on accounts that are in Service Now but are not valid in Central Auth. These are small enough collections that I should be able to pick a few of them by hand and figure out what is going on.
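The Service-Now-minus-Central-Auth difference itself needs no MySQL once both NetID lists exist as sorted one-per-line text files; comm produces it directly. A sketch where the hypothetical three- and two-line files stand in for the real exports:

```shell
# Hypothetical NetID lists, one per line, already sorted (comm requires it)
printf 'alice\nbob\ncarol\n' > service_now_netids.txt
printf 'alice\ncarol\n'      > central_auth_netids.txt

# -23 suppresses lines unique to file 2 and lines in both,
# leaving only NetIDs that exist in Service Now but not Central Auth
comm -23 service_now_netids.txt central_auth_netids.txt
```

The same one-liner in reverse (-13) would list Central Auth NetIDs that have no Service Now entry yet.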
The correct Service Now script to wipe bad user entries
Code Block |
---|
/**
* For variables go to: http://wiki.service-now.com/index.php?title=Import_Sets
**/
var rec = new GlideRecord('sys_user');
// Match the import set row's NetID against user_name
// (equivalent form: rec.addQuery('user_name', '=', source.u_netid);)
rec.addQuery('user_name', source.u_netid);
// Only rows created before the 2012-02-24 cutover are candidates
rec.addQuery('sys_created_on', '<', '2012-02-24 12:00:00');
// Restrict the match to inactive accounts
rec.addInactiveQuery();
rec.query();
if (rec.next()) {
    // Delete the first (and expected only) matching record
    rec.deleteRecord();
}
|