Additional Tools

There are two ways to get data for a simulation if you would like to use values different from the actual values recorded in the metrics store: generate-data generates values as a function of pre-existing metrics or draws them from a random distribution, and the SignalFx scraper queries values from SignalFx.

generate-data

The clusterman generate-data command is a helper for the clusterman simulator that generates “fake” data, either as some function of pre-existing metric data or as draws from a specified random distribution. The command takes an experimental design YAML file as input and produces a compressed JSON file that can be used directly in a simulation.

Note

If the output file already exists, new generated metrics will be appended to it; existing metrics in the output file that share the same name as generated metrics will be overwritten, pending user confirmation.

Experimental Design File Specification

An experimental design file contains details for how to generate experimental metric data for use in a simulation. The specification for the experimental design is as follows:

metric_type:
    metric_name:
        start_time: <date-time string>
        end_time: <date-time string>
        frequency: <frequency specification>
        values: <values specification>
        dict_keys: (optional) <list of dictionary keys>
  • The metric_type should be one of the Metric Types. There should be one section containing all the applicable metric names for each type.

  • Each metric_name is arbitrary; it should correspond to a metric value that clusterman simulate will use when performing its simulation. Multiple metrics can be specified for a given experimental design by repeating the above block in the YAML file for each desired metric; note that if multiple metrics should follow the same data generation specification, YAML anchors and references can be used.

  • The <date-time string> fields can be in a wide variety of formats, both relative and exact. In most cases, dates and times should be specified in ISO-8601 format; for example, 2017-08-03T18:08:44+00:00. However, in some cases it may be useful to specify relative times; these can be given in human-readable format, for example one month ago or -12h.

  • The <frequency specification> can take one of three formats:

    • Historical data: To generate values from historical values, specify historical here and follow the specification for historical values below.

    • Random data: if values will be randomly generated, then the frequency can be in one of two formats:

      • Regular intervals: by providing a <date-time string> for the frequency specification, metric values will be generated periodically; for example, a frequency of 1m will generate a new data point every minute (see the sketch after this list).

      • Random intervals: to generate new metric event arrival times randomly, specify a <random generator> block for the frequency, as shown below:

        distribution: dist-function
        params:
            dist_param_a: param-value
            dist_param_b: param-value
        

        The dist-function should be the name of a function in the Python random module. The params are the keyword arguments for the chosen function. All parameter values relating to time should be defined in seconds; for example, if gauss is chosen for the distribution function, the units for the mean and standard deviation should be seconds.

Note

A common choice for the dist-function is expovariate, which produces exponentially distributed interarrival times, i.e., a Poisson process. This is a good baseline model for the arrival times of real-world data.

  • Similarly, the <values specification> can take one of two formats:

    • Function of historical data: historical values can be linearly transformed by a*x + b. Specify the following block:

      aws_region: <AWS region to read historical data from>
      params:
          a: <value>
          b: <value>
      
    • Random values: for this mode, specify a <random generator> block as shown above for frequency.

  • The dict_keys field takes a list of strings which are used to generate a single timeseries with (potentially) multiple data points per time value. For example, given the following dict_keys configuration:

    metric_a:
        dict_keys:
            - key1
            - key2
            - key3
    

    the resulting generated data for metric_a might look something like the example in Optional Multi-valued Timeseries Data format.
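
To tie these pieces together, here is a minimal sketch of a design block; the metric name, time range, and parameter values are invented purely for illustration, and it assumes the fixed-interval frequency form (1m) and Python's random.gauss keyword arguments (mu and sigma, here in value units rather than seconds, since they are used for values rather than frequency):

system_metrics:
    # Hypothetical metric name, used only for illustration
    cpu_allocation|cluster=example-cluster,role=example-role:

        # Relative date-time strings are accepted as well as ISO-8601
        start_time: "one month ago"
        end_time: "-12h"

        # Regular intervals: emit one data point every minute
        frequency: 1m

        # Values drawn from random.gauss(mu=0.5, sigma=0.1)
        values:
            distribution: gauss
            params:
                mu: 0.5
                sigma: 0.1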

Output Format

The generate-data command produces a compressed JSON file containing the generated metric data. The format of this file is identical to the simulator's Experimental Input Data format.

Sample Usage

drmorr ~ > clusterman generate-data --input design.yaml --output metrics.json.gz
Random Seed: 12345678

drmorr ~ > clusterman simulate --metrics-data-file metrics.json.gz \
> --start-time "2017-08-01T08:00:00+00:00" --end-time "2017-08-01T08:10:00+00:00"

=== Event 0 -- 2017-08-01T08:00:00+00:00        [Simulation begins]
=== Event 2 -- 2017-08-01T08:00:00+00:00        [SpotPriceChangeEvent]
=== Event 28 -- 2017-08-01T08:00:00+00:00       [SpotPriceChangeEvent]
=== Event 21 -- 2017-08-01T08:00:00+00:00       [SpotPriceChangeEvent]
=== Event 22 -- 2017-08-01T08:02:50+00:00       [SpotPriceChangeEvent]
=== Event 3 -- 2017-08-01T08:05:14+00:00        [SpotPriceChangeEvent]
=== Event 23 -- 2017-08-01T08:06:04+00:00       [SpotPriceChangeEvent]
=== Event 1 -- 2017-08-01T08:10:00+00:00        [Simulation ends]

Sample Experimental Design File

metadata:
    spot_prices|aws_availability_zone=us-west-2a,aws_instance_type=c3.8xlarge: &spot_prices

        # If no timezone is specified, generator will use YST
        start_time: "2017-12-01T08:00:00Z"
        end_time: "2017-12-01T09:00:00Z"

        frequency:
            distribution: expovariate
            params:
                lambd: 0.0033333   # Assume prices change on average every five minutes

        values:
            distribution: uniform
            params:
                a: 0
                b: 1

    spot_prices|aws_availability_zone=us-west-2b,aws_instance_type=c3.8xlarge: *spot_prices
    spot_prices|aws_availability_zone=us-west-2c,aws_instance_type=c3.8xlarge: *spot_prices

    capacity|cluster=norcal-prod,role=seagull:
        start_time: "2017-12-01T08:00:00Z"
        end_time: "2017-12-01T09:00:00Z"

        dict_keys:
            - c3.8xlarge,us-west-2a
            - c3.8xlarge,us-west-2b
            - c3.8xlarge,us-west-2c

        frequency:
            distribution: expovariate
            params:
                lambd: 0.001666   # Assume capacity changes on average every ten minutes

        values:
            distribution: randint
            params:
                a: 10
                b: 50

app_metrics:
    seagull_runs:
        start_time: "2017-12-01T08:00:00Z"
        end_time: "2017-12-01T09:00:00Z"
        frequency:
            distribution: expovariate
            params:
                lambd: 0.0041666 # 15 seagull runs per hour
        values: 1


system_metrics:
    cpu_allocation|cluster=everywhere-testopia,role=jolt:
        start_time: "2017-12-01T08:00:00Z"
        end_time: "2017-12-01T09:00:00Z"
        frequency: historical
        values:
            aws_region: "us-west-2"
            params:   # calculate value by a*x + b
                a: 1.5
                b: 10

The above design file and a sample output file are located in docs/examples/design.yaml and docs/examples/metrics.json.gz, respectively.

SignalFx scraper

A tool for downloading data points from SignalFx and saving them in the compressed JSON format that the Clusterman simulator can use. This is an alternative to generating data if the data you're interested in lives in SignalFx but is not yet in Clusterman metrics.

Note

Only data from the last month is available from SignalFx.

The tool will interactively ask you the metric type to save each metric as.

Sample usage:

python -m clusterman.tools.signalfx_scraper --start-time 2017-12-03 --end-time 2017-12-10 \
  --src-metric-names 'seagull.fleet_miser.cluster_capacity_units' --dest-file capacity \
  --api-token <secret> --filter rollup:max region:uswest2-testopia cluster_name:releng