This repository has been archived by the owner on Aug 30, 2024. It is now read-only.

monitoring 4.1 Sensors wish list

Steve Jones edited this page Sep 7, 2017 · 1 revision
  1. Timing data for each phase of the run-instance path

  2. Java Common

    1. Threads per thread-pool
    2. Lock count
    3. DB connection count
    4. Timing for DB queries
    5. Per-service API call failure count
  3. Time for Describe* calls to each cluster from the CLC

    1. DescribeServices
    2. DescribeInstances
    3. DescribeResources
    4. DescribeSensors
  4. NC

    1. Network usage (not Eucalyptus-specific)
    2. VM migrations incoming
    3. VM migrations outgoing
    4. Space left in blob-store
    5. Number of cores used, available
    6. RAM used, available
    7. Monitoring thread execution time (e.g. is it constant, or growing over time?)
  5. SC

    1. Snapshot uploads in progress

    2. Bandwidth per snapshot

    3. Aggregate bandwidth

    4. Concurrent volume operations

    5. Connectivity status to backend

    6. Successful pings & failed pings

  6. Run-Instance timing

    1. Synchronous path
    2. Async full path (pending→running)
  7. CloudWatch

    1. Queue depth for data processing queues

    2. Incoming metrics per time unit

    3. Processed metrics per time unit (to detect dropped metrics)

    4. Alarms

      1. Number evaluated per minute

      2. Number transitioned per minute

    5. Total number of data points in the system

  8. AutoScaling

    1. Number of scaling groups
    2. Scaling actions taken
  9. ELB

    1. Backend service pings succeeded & failed
    2. Event listeners fired (e.g. VM failure detected and instance removed from rotation?)
  10. VPC/Networking

    1. Public IPs in system
    2. Public IPs in use
    3. VPC count
    4. Subnet count
    5. MidoNet API calls failed
    6. MidoNet API calls succeeded
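
Two kinds of sensor recur throughout the list above: per-phase timings (run-instance phases, DB queries, Describe* calls) and paired counters whose difference is the interesting signal (incoming vs. processed metrics, succeeded vs. failed calls). The sketch below is one minimal way such sensors could be collected. It is not the Eucalyptus implementation: `SensorRegistry`, its methods, and the sensor names are all hypothetical and chosen only for illustration.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class SensorRegistry:
    """Hypothetical in-memory store for timing samples and counters."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock                 # injectable so tests can fake time
        self.timings = defaultdict(list)    # sensor name -> durations in seconds
        self.counters = defaultdict(int)    # sensor name -> running count

    @contextmanager
    def timed(self, name):
        """Record how long the wrapped block took, even if it raises."""
        start = self._clock()
        try:
            yield
        finally:
            self.timings[name].append(self._clock() - start)

    def increment(self, name, amount=1):
        """Add to a named counter."""
        self.counters[name] += amount

    def difference(self, first, second):
        """Gap between two counters, e.g. incoming minus processed metrics."""
        return self.counters[first] - self.counters[second]


# Illustrative usage; the sensor names are assumptions, not real identifiers.
sensors = SensorRegistry()
with sensors.timed("run-instance.sync"):
    pass  # placeholder for the synchronous part of a run-instance call

sensors.increment("cloudwatch.metrics.incoming", 5)
sensors.increment("cloudwatch.metrics.processed", 4)
dropped = sensors.difference("cloudwatch.metrics.incoming",
                             "cloudwatch.metrics.processed")
print(dropped)  # 1
```

A monotonic clock is used for the timings so that wall-clock adjustments do not distort measured durations. Keeping raw samples per sensor also makes it possible to see whether a value such as monitoring thread execution time is constant or growing.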

tag:confluence tag:rls-4.1 tag:monitoring



