Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

Code Block
sed "s/\ //g" valid-IST1-netids.csv > valid-IST1-netids_stripped_whitespace.csv
cat valid-IST1-netids_stripped_whitespace.csv | sed "2,50d"  > valid-IST1-netids_stripped_whitespace_start_with_a_through_z.csv 

I had to cleanup my dump from Service Now to get just the fields I wanted...

Code Block

cat a_through_z.csv | awk -F'","' '{print $1","$10",\""$12}' > a_through_z_netids_active_createdate_with_garbage.csv
cat a_through_z_netids_active_createdate_with_garbage.csv | sed 's/^"//g' > a_through_z_netids_active_createdate.csv
Code Block
mysql -uroot (no password initially).
CREATE DATABASE service_now_duplicates;
use service_now_duplicates;
create table legit_netids ( netid VARCHAR(15) );
create table netids_from_service_now ( netid VARCHAR(15), active BOOLEANactive ENUM('true','false'),creation DATETIME);
LOAD DATA INFILE '/Users/db692/service_now_duplicates/round_four/valid-IST1-netids_stripped_whitespace_start_with_a_through_z.csv' INTO TABLE service_now_duplicates.legit_netids;
LOAD DATA INFILE '/Users/db692/service_now_duplicates/round_four/a_through_z_netids_active_createdate.csv' INTO TABLE service_now_duplicates.netids_from_service_now FIELDS TERMINATED BY "," OPTIONALLY ENCLOSED BY '"';
delete from netids_from_service_now WHERE netid="user_name";
delete from legit_netids WHERE netid="USERNAME";

Now we're loaded and ready to start doing some set logic.

Status

As of the end of June 1, 2012, Backeberg has isolated just under 1000 duplicate netid entries. It's fairly straightforward to wipe these out, but we're going to put that off until after I'm out of ServiceNow training next week. I've built a script to do the wipeout; Bill has asked that I make a no-op version that collects a log, and run that in pre-prod. That's running June 12, 2012.