Comment: Migrated to Confluence 4.0

...

Code Block
title: http://yalesandbox.service-now.com/nav_to.do?uri=ecc_queue.do?sys_id=e8966eead4a0200054d3e703207fe9e4
<notification>
  <priority>1</priority>
  <short_description>Load Average is CRITICAL on host meg.its.yale.edu</short_description>
  <comments>Service: Load Average
Host: meg.its.yale.edu
Address: meg.its.yale.edu
State: CRITICAL
Date/Time: Tue Mar 6 14:23:59 EST 2012

Additional Info:

CRITICAL - load average: 13.19, 12.05, 10.57</comments>
  <category>Network</category>
  <checktime>1331061769</checktime>
  <correlation_id>meg.its.yale.edu;Load Average;1331061048</correlation_id>
  <state>CRITICAL</state>
  <servicename>Load Average</servicename>
  <hostname>meg.its.yale.edu</hostname>
  <contact_type>Opsview</contact_type>
</notification>
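
As a rough illustration of how a consumer might read the notification above, here is a minimal Python sketch using the standard library XML parser. The function names are hypothetical, and the reading of correlation_id as host;service;number is an assumption based only on this one sample; the meaning of the trailing number is not stated in these notes.

```python
import xml.etree.ElementTree as ET

# Sample payload copied from the ECC queue record shown above.
SAMPLE = """<notification><priority>1</priority><short_description>Load Average is CRITICAL on host meg.its.yale.edu</short_description><comments>Service: Load Average
Host: meg.its.yale.edu
Address: meg.its.yale.edu
State: CRITICAL
Date/Time: Tue Mar 6 14:23:59 EST 2012

Additional Info:

CRITICAL - load average: 13.19, 12.05, 10.57</comments><category>Network</category><checktime>1331061769</checktime><correlation_id>meg.its.yale.edu;Load Average;1331061048</correlation_id><state>CRITICAL</state><servicename>Load Average</servicename><hostname>meg.its.yale.edu</hostname><contact_type>Opsview</contact_type></notification>"""


def parse_notification(xml_text):
    """Return a dict mapping each child element's tag to its text."""
    root = ET.fromstring(xml_text)
    if root.tag != "notification":
        raise ValueError("unexpected root element: " + root.tag)
    return {child.tag: (child.text or "") for child in root}


def split_correlation_id(correlation_id):
    """Split 'host;service;number' into its parts.

    Assumption: exactly three ';'-separated fields, the last one an integer.
    """
    host, service, number = correlation_id.split(";")
    return host, service, int(number)


fields = parse_notification(SAMPLE)
print(fields["state"], fields["hostname"], fields["servicename"])
print(split_correlation_id(fields["correlation_id"]))
```

A host or service name containing a semicolon would break the split, so a real consumer would need a stricter rule than this sketch uses.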

ServiceNow-Side Configuration

Concerns

There is a lot of state change within Opsview. Some of it looks spurious on human inspection, yet Opsview always treats it as a genuine issue. Yale should tread carefully when opening this Opsview data flow up to ServiceNow, so as to avoid a firehose condition.

Moreover, this integration should be used as a lightning rod to push for amending the monitoring stack where appropriate, to dial back the amount of state change.
-nick, 20120305

  1. We can work on dialing down the noise by tracking Top Talkers
  2. Need to confirm that renotifies don't generate duplicate incidents
  3. Vet the process of auto-acking states when an incident is opened in ServiceNow
  4. Nail down the process for targeting top talkers and minimizing false positives where possible (timeout increases, consecutive hit increases, etc.)
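
Items 1 and 2 above could be prototyped offline against a batch of parsed notifications. The sketch below is illustrative only: it assumes each notification is a dict keyed by the XML element names from the sample payload, and it assumes a renotification carries the same correlation_id as the original alert, which is exactly the point item 2 says still needs confirming.

```python
from collections import Counter


def top_talkers(notifications, limit=10):
    """Count notifications per (hostname, servicename), noisiest first."""
    counts = Counter((n["hostname"], n["servicename"]) for n in notifications)
    return counts.most_common(limit)


def drop_renotifies(notifications):
    """Keep only the first notification seen for each correlation_id.

    Assumption (unconfirmed): renotifies reuse the original correlation_id.
    """
    seen = set()
    unique = []
    for n in notifications:
        key = n["correlation_id"]
        if key in seen:
            continue  # treated as a renotify of an incident already opened
        seen.add(key)
        unique.append(n)
    return unique


# Hypothetical events for illustration; not real Opsview data.
events = [
    {"hostname": "a", "servicename": "Load", "correlation_id": "a;Load;1"},
    {"hostname": "a", "servicename": "Load", "correlation_id": "a;Load;1"},
    {"hostname": "a", "servicename": "Load", "correlation_id": "a;Load;2"},
    {"hostname": "b", "servicename": "Disk", "correlation_id": "b;Disk;3"},
]
print(top_talkers(events, limit=1))
print(len(drop_renotifies(events)))
```

The top-talker list gives a ranked set of checks to tune first (timeouts, consecutive-hit thresholds) before their alerts are allowed to open incidents.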

Tickets

Opsview

ServiceNow

  • inquire about least privilege needed for bind account
  • inquire about which addresses the monitoring stations need egress permitted to

...

  • Consider keywords for ServiceNow only; not all events would enter SN, only those we tag for it
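
The keyword idea above amounts to a positive match: an event is forwarded only if it carries at least one keyword designated for ServiceNow. A minimal sketch follows; the function name and the keyword values are hypothetical, and in practice this matching would live in Opsview notification profiles, not in separate code.

```python
def should_forward(event_keywords, servicenow_keywords):
    """Return True if the event has at least one ServiceNow-designated keyword."""
    return not set(event_keywords).isdisjoint(servicenow_keywords)


# Hypothetical keyword names for illustration only.
SN_KEYWORDS = {"servicenow"}
print(should_forward(["its", "servicenow"], SN_KEYWORDS))
print(should_forward(["yuhs"], SN_KEYWORDS))
```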

Final considerations

  • per Lou: Critical Alerts from Opsview should be set to a priority level of "High" in SN and Warnings be set as "Average".
  • field mappings: "CLIENT=OPsView", "CONTACT=Operator", "NOTIFY=NONE", "CONTACT TYPE=Tier 2", "Incident Type=Service Event", and "Priority=3-High" on the SN screen.
  • left-nav item for unassigned opsview tickets (or bookmark/view) for ops staff
  • Downtimed state change => handled => open ticket?
  • Yellows/warnings + tickets as they affect workflow
    • OVS-3236
    • Assign warns to proper group; assign crit to DC Ops
  • Acknowledgements
    • auto-ack @ Opsview and how this impacts Yale workflow
    • this means that the Unhandled column in Opsview goes away, becomes meaningless, etc. Important if that's a catalyst for starting Ops work (procedure, call, etc.).
  • Getting keyword data transmitted
  • Oddballs (non-ITS like YUHS)
    • Oddballs can stay behind if needed; they don't block ITS from moving. This is all driven by notification profiles, where we can positive-match keywords.
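
The priority and assignment rules in the list above (Critical to "High", Warning to "Average", criticals to DC Ops, warnings to the proper group) could be sketched as a small lookup. This is an illustration, not the ServiceNow-side implementation: the dict keys "priority" and "assignment_group" are hypothetical labels, the warning group is passed in because the notes do not name it, and returning None for other states is an assumption.

```python
# Mapping taken from the "per Lou" note above.
PRIORITY_BY_STATE = {"CRITICAL": "High", "WARNING": "Average"}


def incident_fields(state, warning_group):
    """Return the priority and assignment group for an Opsview state.

    Returns None for states the notes do not cover (assumption: no incident).
    """
    state = state.upper()
    if state not in PRIORITY_BY_STATE:
        return None
    # Criticals go to DC Ops; warnings go to whichever group the caller supplies.
    group = "DC Ops" if state == "CRITICAL" else warning_group
    return {"priority": PRIORITY_BY_STATE[state], "assignment_group": group}


# "Example Group" is a placeholder, not a real Yale assignment group.
print(incident_fields("CRITICAL", "Example Group"))
print(incident_fields("WARNING", "Example Group"))
```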