An API inference is characterized by a set of input parameters sent to a base model, possibly fine-tuned, running on cloud or proprietary hardware, powered by energy from a grid region, at a particular time, and generating outputs like text, images, audio, and video. The Scope3 AI API uses this information to model the energy use, CO2, and water impact of an inference.

Simple Example

Here’s a simple example of a text generation inference. To try this, POST it to the API endpoint https://aiapi.scope3.com/impact or use our live API reference. You can also use the Google Sheets Integration to play with the API.

{
  "rows": [
    {
      "model": {
        "family": "claude (new)",
        "name": "3.5 Sonnet"
      },
      "hosting": {
        "service": "aws-bedrock"
      },
      "task": "text-generation",
      "input_tokens": 100,
      "output_tokens": 500
    }
  ]
}
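As a rough sketch, the request above could be prepared from Python using only the standard library. The helper name `build_impact_request` and the bearer-token `Authorization` header are assumptions made for illustration; check the API reference for the actual authentication scheme.

```python
import json
import urllib.request

API_URL = "https://aiapi.scope3.com/impact"  # endpoint named in this guide


def build_impact_request(rows, api_key):
    """Build (but do not send) a POST request for a list of inference rows."""
    body = json.dumps({"rows": rows}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Assumption: bearer-token auth; the real scheme may differ.
            "Authorization": f"Bearer {api_key}",
        },
    )


# The same row as the JSON example above.
row = {
    "model": {"family": "claude (new)", "name": "3.5 Sonnet"},
    "hosting": {"service": "aws-bedrock"},
    "task": "text-generation",
    "input_tokens": 100,
    "output_tokens": 500,
}
request = build_impact_request([row], api_key="YOUR_API_KEY")
```

To send it, pass `request` to `urllib.request.urlopen` and decode the JSON body of the response.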

Model Parameters

Not every model accepts all of these inputs, but the API recognizes the following:

  • Prompt Tokens: Text input length measured in tokens
  • Images: Dimensions in pixels (e.g., 1024x1024), number of diffusion steps
  • Audio: Duration and sample rate

Example for image generation:

{
  "input_tokens": 50,
  "input_steps": 30
}

Example for audio processing:

{
  "input_audio_duration_s": 240
}

Base Model

A base model can be specified in three ways. To see the list of models, you can query the List Model API.

1. By Model Family and Name

{
  "model": {
    "family": "gpt",
    "name": "GPT-4 Turbo Preview"
  }
}

2. By Model ID

{
  "model": {
    "id": "llama_31_8b"
  }
}

3. By Hugging Face Path

{
  "model": {
    "hugging_face_path": "microsoft/OmniParser"
  }
}
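The three styles above can be wrapped in a small helper so that calling code picks exactly one of them. This is only a sketch: `model_spec` is a hypothetical name, and the rule that the styles must not be mixed is an assumption, since this guide does not say how the API treats a mixed specification.

```python
def model_spec(family=None, name=None, model_id=None, hugging_face_path=None):
    """Return a "model" object that uses exactly one identification style."""
    by_family = family is not None or name is not None
    chosen = [by_family, model_id is not None, hugging_face_path is not None]
    # Assumption: exactly one style per request; the API may be more lenient.
    if sum(chosen) != 1:
        raise ValueError("use family+name, id, or hugging_face_path, not a mix")
    if by_family:
        if family is None or name is None:
            raise ValueError("family and name must be given together")
        return {"family": family, "name": name}
    if model_id is not None:
        return {"id": model_id}
    return {"hugging_face_path": hugging_face_path}


by_name = model_spec(family="gpt", name="GPT-4 Turbo Preview")
by_id = model_spec(model_id="llama_31_8b")
by_path = model_spec(hugging_face_path="microsoft/OmniParser")
```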

Fine-tuning

Fine-tuned models are not supported yet.

Hardware

1. Managed Service Provider

{
  "hosting": {
    "service": "aws-bedrock"
  }
}

2. Cloud Instance

{
  "hosting": {
    "cloud": "aws",
    "instance": "p4d.24xlarge",
    "utilization_pct": 0.8
  }
}

3. Dedicated Hardware Node

{
  "hosting": {
    "country": "US",
    "region": "NY",
    "gpu": "a100",
    "gpu_count": 8,
    "cpu_count": 2,
    "idle_power_w_ex_gpu": 100,
    "average_utilization_rate": 0.8,
    "embodied_emissions_kgco2e_ex_gpu": 2500
  }
}
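The three hosting shapes above differ in which keys they carry, so client code can tell them apart before sending a request. The sketch below assumes the key names shown in the examples; the function name `hosting_kind` and the returned labels are hypothetical and are not part of the API.

```python
def hosting_kind(hosting):
    """Classify a "hosting" object by the keys used in the examples above."""
    if "service" in hosting:
        return "managed-service"   # e.g. {"service": "aws-bedrock"}
    if "cloud" in hosting and "instance" in hosting:
        return "cloud-instance"    # e.g. {"cloud": "aws", "instance": "..."}
    if "gpu" in hosting:
        return "dedicated-node"    # e.g. {"gpu": "a100", "gpu_count": 8, ...}
    raise ValueError("hosting matches none of the three documented shapes")


kinds = [
    hosting_kind({"service": "aws-bedrock"}),
    hosting_kind({"cloud": "aws", "instance": "p4d.24xlarge", "utilization_pct": 0.8}),
    hosting_kind({"country": "US", "region": "NY", "gpu": "a100", "gpu_count": 8}),
]
```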

Task Types

Language Models

  • Text generation
  • Chat
  • Embeddings
  • Classification
  • Summarization
  • Translation

Example of a chat completion:

{
  "rows": [
    {
      "model": {
        "family": "gpt",
        "name": "GPT-4"
      },
      "task": "chat",
      "input_tokens": 150,
      "output_tokens": 300
    }
  ]
}

Computer Vision

  • Image classification
  • Object detection
  • Image generation
  • Style transfer
  • Upscaling

Example of text-to-image generation:

{
  "rows": [
    {
      "model": {
        "family": "openai",
        "name": "DALL-E 3"
      },
      "hosting": {
        "service": "azure-ml"
      },
      "task": "text-to-image",
      "input_tokens": 50,
      "output_images": ["1024x1024", "1024x1024"],
      "input_steps": 50
    }
  ]
}

Audio/Speech

  • Speech-to-text
  • Text-to-speech
  • Audio classification

Example of speech-to-text:

{
  "rows": [
    {
      "model": {
        "family": "openai",
        "name": "whisper-1"
      },
      "task": "speech-to-text",
      "input_audio_duration_s": 240
    }
  ]
}

Video Processing and Generation

  • Text-to-video generation
  • Video-to-video transformation
  • Video upscaling
  • Frame interpolation
  • Video editing

Example of text-to-video generation:

{
  "rows": [
    {
      "model": {
        "family": "stable-diffusion",
        "name": "Stable Video XL"
      },
      "hosting": {
        "cloud": "gcp",
        "instance": "a2-highgpu-8g"
      },
      "task": "text-to-video",
      "input_tokens": 75,
      "output_video_frames": 90,
      "output_video_resolution": 1080,
      "input_steps": 40
    }
  ]
}

API Response

The API returns impact metrics for each inference:

{
  "rows": [
    {
      "inference_impact": {
        "usage_energy_wh": 0.13,
        "usage_emissions_gco2e": 0.81,
        "usage_water_ml": 1.32,
        "embodied_emissions_gco2e": 0.81,
        "embodied_water_ml": 1.32
      },
      "training_impact": {
        "usage_energy_wh": 0.13,
        "usage_emissions_gco2e": 0.81,
        "usage_water_ml": 1.32,
        "embodied_emissions_gco2e": 0.81,
        "embodied_water_ml": 1.32
      },
      "total_impact": {
        "usage_energy_wh": 0.26,
        "usage_emissions_gco2e": 1.62,
        "usage_water_ml": 2.64,
        "embodied_emissions_gco2e": 1.62,
        "embodied_water_ml": 2.64
      }
    }
  ],
  "total_energy_wh": 0.26,
  "total_gco2e": 1.62,
  "total_mlh2o": 2.64
}
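When a response contains several rows, the per-row metrics can be added up on the client side. The following is a minimal sketch that assumes the response shape shown above; `sum_impact` is a hypothetical helper, not part of the API.

```python
def sum_impact(response, section="total_impact"):
    """Add up each metric of one impact section across all rows."""
    totals = {}
    for row in response["rows"]:
        for metric, value in row[section].items():
            totals[metric] = totals.get(metric, 0.0) + value
    return totals


# A trimmed two-row response using field names from the example above.
sample = {
    "rows": [
        {"total_impact": {"usage_energy_wh": 0.26, "usage_water_ml": 2.64}},
        {"total_impact": {"usage_energy_wh": 0.26, "usage_water_ml": 2.64}},
    ]
}
totals = sum_impact(sample)
```

Passing `section="inference_impact"` or `section="training_impact"` sums those sections instead.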

Complete Example

Here’s a complete example showing multiple inferences:

{
  "rows": [
    {
      "requestTime": "2024-01-01T12:00:00Z",
      "model": {
        "family": "claude",
        "name": "3.5 Sonnet"
      },
      "hosting": {
        "service": "aws-bedrock"
      },
      "task": "text-generation",
      "input_tokens": 100,
      "output_tokens": 500
    },
    {
      "requestTime": "2024-01-01T12:01:00Z",
      "model": {
        "family": "stable-diffusion",
        "name": "Stable Diffusion XL"
      },
      "hosting": {
        "cloud": "aws",
        "instance": "p4d.24xlarge",
        "utilization_pct": 0.8
      },
      "task": "text-to-image",
      "input_tokens": 50,
      "output_images": ["1024x1024"]
    }
  ]
}
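The `requestTime` values in this example are UTC timestamps with a trailing `Z`. One way to produce that format from Python is sketched below; the helper name `request_time` is hypothetical, and whether the API accepts other timestamp formats is not covered here.

```python
from datetime import datetime, timezone


def request_time(dt):
    """Format a timezone-aware datetime like "2024-01-01T12:00:00Z"."""
    if dt.tzinfo is None:
        # Naive datetimes are ambiguous, so refuse them rather than guess.
        raise ValueError("datetime must be timezone-aware")
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


stamp = request_time(datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc))
```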

Error Handling

TODO: Add error handling

Common error codes:

  • 401: Unauthorized - Invalid or missing API key
  • 403: Forbidden - API key doesn’t have access to requested resource
  • 406: Not acceptable - Invalid request format
  • 415: Unsupported media type
  • 429: Too many requests - Rate limit exceeded
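
A client might map these status codes to messages and decide which ones are worth retrying. The sketch below is illustrative only: the helper names are hypothetical, and retrying only on 429 with exponential backoff is an assumption, not documented API behavior.

```python
ERROR_MEANINGS = {
    401: "Unauthorized: invalid or missing API key",
    403: "Forbidden: API key lacks access to the requested resource",
    406: "Not acceptable: invalid request format",
    415: "Unsupported media type",
    429: "Too many requests: rate limit exceeded",
}


def explain_error(status):
    """Return a readable message for an HTTP status code."""
    return ERROR_MEANINGS.get(status, f"Unexpected HTTP status {status}")


def is_retryable(status):
    # Assumption: only rate limiting is transient; the other codes listed
    # above indicate a problem with the request or the API key.
    return status == 429


def backoff_delay(attempt, base_s=1.0, cap_s=30.0):
    """Seconds to wait before retry number `attempt` (0-based), capped."""
    return min(cap_s, base_s * 2 ** attempt)
```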