vDAG Demo

🧠 vDAGs — Building Scalable, Distributed AI Workflows with Blocks

In modern AI systems, building powerful applications is no longer about deploying a single large model. Instead, it's about connecting many smaller, reusable, and scalable components — each handling a part of the task. This is where vDAGs, or virtual Directed Acyclic Graphs, step in as a game-changing abstraction for designing distributed AI workflows.

A vDAG represents a virtual workflow composed of interconnected “blocks”, where each block serves a specific AI or computational function. These blocks are created and deployed by developers on their own clusters using the AIOSv1 Instance SDK, and can scale independently based on demand.

What makes vDAGs powerful is that they let you build end-to-end applications from these distributed blocks, orchestrating them as a graph that can run within a single cluster or span multiple clusters.

🔍 What Is a Block?

Before diving deeper into vDAGs, let’s clarify what a block is.

A block is the core serving component in the AIOSv1 ecosystem. It represents a self-contained unit responsible for:

Instantiating and serving AI models or general-purpose computation
Scaling based on load
Being managed dynamically across a distributed cluster environment

Blocks are deployed by users on any cluster that meets the resource requirements. Once deployed, they can serve:

As nodes in one or more vDAGs
Or as standalone inference endpoints outside any vDAG

You can read more about the block here.

🔗 What Is a vDAG?

A vDAG (virtual Directed Acyclic Graph) is a workflow composed of blocks, where each node in the graph is a block (or even another vDAG). It defines how data flows through a sequence of operations — such as preprocessing, model inference, post-processing, and so on — executed across the network of blocks.

The key word here is virtual. vDAGs don’t physically contain the logic — they refer to existing blocks deployed on clusters. Think of a vDAG as a blueprint or routing plan for how a particular task should be processed by different blocks across the network.

⚙️ Key Features of vDAG

✅ Modular Composition: Each node is a block that can be reused in multiple workflows or used standalone.
✅ Nested Graphs: Nodes in a vDAG can themselves reference other vDAGs (subgraphs).
✅ Cross-Cluster Execution: Nodes can reside on different clusters depending on where blocks are deployed.
✅ Assignment Policies: During vDAG creation, a policy can select the most suitable block from a pool of candidates for a given node.
✅ Custom Behavior: Each node supports pre-processing and post-processing policies, which run before or after the core block function — allowing for transformations, validation, routing logic, etc.
✅ Flexible Patterns: Supports fan-in/fan-out logic like multiple producers → single consumer, ensembles, and branching.
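To make the fan-in/fan-out idea concrete, here is a minimal sketch that models a workflow as a plain dictionary and derives one valid execution order with Python's standard `graphlib` module. The node labels and the `execution_order` helper are hypothetical and are not part of the AIOSv1 SDK; the sketch only illustrates the acyclic-graph property that a vDAG relies on.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical node labels. Each key lists the nodes whose output it consumes.
predecessors = {
    "detector-a": [],
    "detector-b": [],
    "ensemble": ["detector-a", "detector-b"],  # fan-in: two producers, one consumer
    "alerter": ["ensemble"],                   # fan-out: "ensemble" feeds two consumers
    "archiver": ["ensemble"],
}


def execution_order(graph):
    """Return one valid execution order, or raise ValueError if the graph has a cycle."""
    try:
        return list(TopologicalSorter(graph).static_order())
    except CycleError as exc:
        # args[1] holds the list of nodes that form the detected cycle
        raise ValueError(f"not a DAG, cycle found: {exc.args[1]}") from exc


order = execution_order(predecessors)
print(order)
```

Several orders can be valid for the same graph; the only guarantee is that every node appears after all of the nodes it consumes from.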

vDAGs End-to-End Workflow

This notebook demonstrates how to build and interact with vDAGs (virtual Directed Acyclic Graphs) — a powerful abstraction for building scalable, distributed AI pipelines.

We'll walk through:

✅ Creating a vDAG with multiple LLM blocks
🛠️ Deploying a Controller that manages routing, health, and policy execution
🚀 Submitting real inference requests using multimodal inputs
🧹 Cleaning up the controller after use

By the end of this notebook, you will have a working vDAG pipeline capable of handling image + text analysis for surveillance-style use cases.

🧱 Step 1: Register the vDAG

In this step, we define a virtual DAG that outlines how input data should flow through a series of AI blocks.

Each block in the vDAG performs a specific role:

gemma3-27b-block: Analyzes the input image and extracts semantic scene information (objects, interactions, environment).
llama4-scout-17b-block: Processes the extracted attributes to detect high-level events like theft or accidents.
magistral-small-2506-llama-cpp-block: Acts as a decision-making module to determine if the event requires escalation (e.g., alerts, notifications).

🔄 Flow: gemma3-27b-block → llama4-scout-17b-block → magistral-small-2506-llama-cpp-block

You can read more about Functions (here).

You can read more about the vDAG spec (here).

The Python script below registers the vDAG. Once the vDAG is created, it is assigned a unique vDAG URI, a string that combines the vdagName and vdagVersion fields as follows: <vdagName>:<vdagVersion.version>-<vdagVersion.release-tag>.

Pre-requisites:

The user should know how to deploy blocks on the AIOS cluster.
The user should know how to register a policy.
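The naming rule for the vDAG URI can be sketched in a few lines. The helpers `build_vdag_uri` and `parse_vdag_uri` are hypothetical and only mirror the rule described here; the parser additionally assumes that the version string itself contains no hyphen.

```python
def build_vdag_uri(vdag_name: str, vdag_version: dict) -> str:
    """Combine the name and version fields into <name>:<version>-<release-tag>."""
    return f"{vdag_name}:{vdag_version['version']}-{vdag_version['release-tag']}"


def parse_vdag_uri(uri: str) -> tuple:
    """Split a vDAG URI back into (name, version, release_tag)."""
    name, _, rest = uri.partition(":")          # the name may contain hyphens
    version, _, release_tag = rest.partition("-")  # assumes no hyphen in the version
    return name, version, release_tag


uri = build_vdag_uri("llm-analyzer-aug6", {"version": "0.0.6", "release-tag": "stable"})
print(uri)  # llm-analyzer-aug6:0.0.6-stable
```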

import requests

PARSER_URL = "http://MANAGEMENTMASTER:30501/api/createvDAG"

data = {
  "parser_version": "Parser/V1",
  "body": {
    "spec": {
      "values": {
        "vdagName": "llm-analyzer-aug6",
        "vdagVersion": {
          "version": "0.0.6",
          "release-tag": "stable"
        },
        "discoveryTags": [
          "vdag-llm",
          "llm-vdag"
        ],
        "controller": {},
        "nodes": [
          {
            "spec": {
              "values": {
                "nodeLabel": "gemma3-27b-block",
                "nodeType": "block",
                "manualBlockId": "gemma3-27b-block",
                "preprocessingPolicyRule": {},
                "postprocessingPolicyRule": {},
                "modelParameters": {}
              },
              "IOMap": [
                {
                  "inputs": [
                    {
                      "name": "input_0",
                      "reference": "input_0"
                    }
                  ],
                  "outputs": [
                    {
                      "name": "output_0",
                      "reference": "output_0"
                    }
                  ]
                }
              ]
            }
          },
          {
            "spec": {
              "values": {
                "nodeLabel": "llama4-scout-17b-block",
                "nodeType": "block",
                "manualBlockId": "llama4-scout-17b-block",
                "preprocessingPolicyRule": {},
                "postprocessingPolicyRule": {},
                "modelParameters": {}
              },
              "IOMap": [
                {
                  "inputs": [
                    {
                      "name": "input_0",
                      "reference": "input_0"
                    }
                  ],
                  "outputs": [
                    {
                      "name": "output_0",
                      "reference": "output_0"
                    }
                  ]
                }
              ]
            }
          },
          {
            "spec": {
              "values": {
                "nodeLabel": "magistral-small-2506-llama-cpp-block",
                "nodeType": "block",
                "manualBlockId": "magistral-small-2506-llama-cpp-block",
                "preprocessingPolicyRule": {},
                "postprocessingPolicyRule": {
                  "policyRuleURI": "post_processor_for_job_caller:0.0.1-stable"
                },
                "modelParameters": {}
              },
              "IOMap": [
                {
                  "inputs": [
                    {
                      "name": "input_0",
                      "reference": "input_0"
                    }
                  ],
                  "outputs": [
                    {
                      "name": "output_0",
                      "reference": "output_0"
                    }
                  ]
                }
              ]
            }
          }
        ],
        "graph": {
          "input": [
            {
              "nodeLabel": "gemma3-27b-block",
              "inputNames": [
                "input_0"
              ]
            }
          ],
          "output": [
            {
              "nodeLabel": "magistral-small-2506-llama-cpp-block",
              "outputNames": [
                "output_0"
              ]
            }
          ],
          "connections": [
            {
              "nodeLabel": "llama4-scout-17b-block",
              "inputs": [
                {
                  "nodeLabel": "gemma3-27b-block",
                  "outputNames": [
                    "output_0"
                  ]
                }
              ]
            },
            {
              "nodeLabel": "magistral-small-2506-llama-cpp-block",
              "inputs": [
                {
                  "nodeLabel": "llama4-scout-17b-block",
                  "outputNames": [
                    "output_0"
                  ]
                }
              ]
            }
          ]
        }
      }
    }
  }
}

response = requests.post(PARSER_URL, json=data)
print(response.status_code)
print('api response', response.json())

200
api response {'result': {'task_id': '3c5887c7-6544-43d1-92af-797a325af7c2', 'vdagURI': 'llm-analyzer-aug6:0.0.6-stable'}, 'success': True, 'task_id': ''}
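The three node entries in the payload above differ only in their label, block ID and post-processing rule, and the connections form a simple chain. As a sketch, the repetition could be generated with two small helpers. The field names are copied from the payload above; the helper names `make_block_node` and `chain_connections` are hypothetical.

```python
def make_block_node(label, block_id=None, post_policy_uri=None):
    """Build one node entry with a single input_0/output_0 mapping."""
    post_rule = {"policyRuleURI": post_policy_uri} if post_policy_uri else {}
    return {
        "spec": {
            "values": {
                "nodeLabel": label,
                "nodeType": "block",
                "manualBlockId": block_id or label,  # defaults to the label, as above
                "preprocessingPolicyRule": {},
                "postprocessingPolicyRule": post_rule,
                "modelParameters": {},
            },
            "IOMap": [{
                "inputs": [{"name": "input_0", "reference": "input_0"}],
                "outputs": [{"name": "output_0", "reference": "output_0"}],
            }],
        }
    }


def chain_connections(labels):
    """Connect each node's output_0 to the next node in the list."""
    return [
        {"nodeLabel": dst, "inputs": [{"nodeLabel": src, "outputNames": ["output_0"]}]}
        for src, dst in zip(labels, labels[1:])
    ]


labels = ["gemma3-27b-block", "llama4-scout-17b-block", "magistral-small-2506-llama-cpp-block"]
nodes = [make_block_node(label) for label in labels[:-1]]
nodes.append(make_block_node(labels[-1], post_policy_uri="post_processor_for_job_caller:0.0.1-stable"))
connections = chain_connections(labels)
```

The resulting `nodes` and `connections` lists can be placed under `"nodes"` and `"graph"."connections"` in the registration payload.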

You can confirm that the vDAG was created by querying the vDAGs registry with its vdagURI:

%%bash
curl -X GET http://MANAGEMENTMASTER:30103/vdag/llm-analyzer-aug6:0.0.6-stable | json_pp

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  2777  100  2777    0     0   4856      0 --:--:-- --:--:-- --:--:--  4854


{
   "data" : {
      "assignment_info" : {
         "gemma3-27b-block" : "gemma3-27b-block",
         "llama4-scout-17b-block" : "llama4-scout-17b-block",
         "magistral-small-2506-llama-cpp-block" : "magistral-small-2506-llama-cpp-block"
      },
      "compiled_graph_data" : {
         "head" : "gemma3-27b-block",
         "rev_mapping" : {
            "gemma3-27b-block" : "gemma3-27b-block",
            "llama4-scout-17b-block" : "llama4-scout-17b-block",
            "magistral-small-2506-llama-cpp-block" : "magistral-small-2506-llama-cpp-block"
         },
         "t2_graph" : {
            "gemma3-27b-block" : [
               "llama4-scout-17b-block"
            ],
            "llama4-scout-17b-block" : [
               "magistral-small-2506-llama-cpp-block"
            ],
            "magistral-small-2506-llama-cpp-block" : []
         },
         "t3_graph" : {
            "gemma3-27b-block" : {
               "outputs" : [
                  {
                     "block_id" : "llama4-scout-17b-block",
                     "host" : "llama4-scout-17b-block-executor-svc.blocks.svc.cluster.local",
                     "port" : 6379,
                     "queue_name" : "llama4-scout-17b-block_inputs"
                  }
               ]
            },
            "llama4-scout-17b-block" : {
               "outputs" : [
                  {
                     "block_id" : "magistral-small-2506-llama-cpp-block",
                     "host" : "magistral-small-2506-llama-cpp-block-executor-svc.blocks.svc.cluster.local",
                     "port" : 6379,
                     "queue_name" : "magistral-small-2506-llama-cpp-block_inputs"
                  }
               ]
            },
            "magistral-small-2506-llama-cpp-block" : {
               "outputs" : []
            }
         },
         "tail" : [
            "magistral-small-2506-llama-cpp-block"
         ]
      },
      "controller" : {
         "initParameters" : {},
         "initSettings" : {},
         "inputSources" : [],
         "policies" : []
      },
      "discoveryTags" : [
         "vdag-llm",
         "llm-vdag"
      ],
      "graph" : {
         "connections" : [
            {
               "inputs" : [
                  {
                     "nodeLabel" : "gemma3-27b-block",
                     "outputNames" : [
                        "output_0"
                     ]
                  }
               ],
               "nodeLabel" : "llama4-scout-17b-block"
            },
            {
               "inputs" : [
                  {
                     "nodeLabel" : "llama4-scout-17b-block",
                     "outputNames" : [
                        "output_0"
                     ]
                  }
               ],
               "nodeLabel" : "magistral-small-2506-llama-cpp-block"
            }
         ],
         "input" : [
            {
               "inputNames" : [
                  "input_0"
               ],
               "nodeLabel" : "gemma3-27b-block"
            }
         ],
         "output" : [
            {
               "nodeLabel" : "magistral-small-2506-llama-cpp-block",
               "outputNames" : [
                  "output_0"
               ]
            }
         ]
      },
      "metadata" : {},
      "nodes" : [
         {
            "IOMap" : [],
            "assignmentPolicyRule" : {},
            "inputProtocol" : {},
            "manualBlockId" : "gemma3-27b-block",
            "modelParameters" : {},
            "nodeLabel" : "gemma3-27b-block",
            "nodeType" : "block",
            "outputProtocol" : {},
            "postprocessingPolicyRule" : {},
            "preprocessingPolicyRule" : {},
            "vdagURI" : ""
         },
         {
            "IOMap" : [],
            "assignmentPolicyRule" : {},
            "inputProtocol" : {},
            "manualBlockId" : "llama4-scout-17b-block",
            "modelParameters" : {},
            "nodeLabel" : "llama4-scout-17b-block",
            "nodeType" : "block",
            "outputProtocol" : {},
            "postprocessingPolicyRule" : {},
            "preprocessingPolicyRule" : {},
            "vdagURI" : ""
         },
         {
            "IOMap" : [],
            "assignmentPolicyRule" : {},
            "inputProtocol" : {},
            "manualBlockId" : "magistral-small-2506-llama-cpp-block",
            "modelParameters" : {},
            "nodeLabel" : "magistral-small-2506-llama-cpp-block",
            "nodeType" : "block",
            "outputProtocol" : {},
            "postprocessingPolicyRule" : {
               "policyRuleURI" : "post_processor_for_job_caller:0.0.1-stable"
            },
            "preprocessingPolicyRule" : {},
            "vdagURI" : ""
         }
      ],
      "status" : "assigned",
      "vdagURI" : "llm-analyzer-aug6:0.0.6-stable",
      "vdag_name" : "llm-analyzer-aug6",
      "vdag_version" : {
         "release-tag" : "stable",
         "version" : "0.0.6"
      }
   },
   "success" : true
}
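The compiled_graph_data section of the response can be used to check that the registry compiled the graph as intended. The sketch below follows t2_graph from head to tail for a linear pipeline. The `walk_linear` helper is hypothetical, and the reading of t2_graph as "node label to list of downstream nodes" is an assumption based on the output above.

```python
# Trimmed copy of compiled_graph_data from the registry response above.
compiled = {
    "head": "gemma3-27b-block",
    "t2_graph": {
        "gemma3-27b-block": ["llama4-scout-17b-block"],
        "llama4-scout-17b-block": ["magistral-small-2506-llama-cpp-block"],
        "magistral-small-2506-llama-cpp-block": [],
    },
    "tail": ["magistral-small-2506-llama-cpp-block"],
}


def walk_linear(compiled_graph):
    """Follow successors from the head node; only handles graphs without branching."""
    path = [compiled_graph["head"]]
    while True:
        successors = compiled_graph["t2_graph"].get(path[-1], [])
        if not successors:
            return path
        if len(successors) > 1:
            raise ValueError(f"{path[-1]} branches; this sketch only handles chains")
        if successors[0] in path:
            raise ValueError("cycle detected")
        path.append(successors[0])


path = walk_linear(compiled)
print(" -> ".join(path))
```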

🧭 Step 2: Deploy a vDAG Controller

The vDAG Controller is the runtime engine that orchestrates the flow of data through the vDAG graph.

It handles:

Task routing between blocks
Health and quota monitoring
Quality management, using a policy to capture outputs and verify them (manually or automatically)

You can read about the vDAG controller here.

Configuration Parameters:

vdag_uri: Which vDAG this controller will serve
policy_execution_mode: Whether to run policies locally or remotely
replicas: How many controller pods to run (for redundancy or scale)

You can deploy multiple controllers for the same vDAG across clusters for multi-region or HA setups.

The command below deploys a vDAG controller for the vDAG we created with 2 replicas:

%%bash
curl -X POST http://MANAGEMENTMASTER:30600/vdag-controller/gcp-cluster-2 \
  -H "Content-Type: application/json" \
  -d '{
    "action": "create_controller",
    "payload": {
      "vdag_controller_id": "aug6-controller", 
      "vdag_uri": "llm-analyzer-aug6:0.0.6-stable",
      "config": {
        "policy_execution_mode": "local",
        "replicas": 2
      },
      "search_tags": []
    }
  }'

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   336  100    58  100   278     91    437 --:--:-- --:--:-- --:--:--   529


{"data":"Controller created successfully","success":true}
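The same request can be assembled in Python. The sketch below uses only the standard library; the URL, action name and payload fields are taken from the curl command above, while the function names `controller_request` and `create_controller` are hypothetical.

```python
import json
from urllib import request

MANAGEMENT_URL = "http://MANAGEMENTMASTER:30600"  # placeholder host, as in the curl command


def controller_request(cluster_id, action, payload):
    """Build (but do not send) a POST request for the controller management API."""
    body = json.dumps({"action": action, "payload": payload}).encode("utf-8")
    return request.Request(
        f"{MANAGEMENT_URL}/vdag-controller/{cluster_id}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def create_controller(cluster_id, controller_id, vdag_uri, replicas=1, mode="local", timeout=30):
    """Send a create_controller action and return the decoded JSON response."""
    req = controller_request(cluster_id, "create_controller", {
        "vdag_controller_id": controller_id,
        "vdag_uri": vdag_uri,
        "config": {"policy_execution_mode": mode, "replicas": replicas},
        "search_tags": [],
    })
    with request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read())


# Example call (requires a reachable management server):
# create_controller("gcp-cluster-2", "aug6-controller", "llm-analyzer-aug6:0.0.6-stable", replicas=2)
```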

The controllers available for a given vDAG can be listed by specifying its vDAG URI. The example below queries a different vDAG, llm-analyzer:0.0.3-stable, so the listing does not include aug6-controller:

%%bash
curl -X GET http://MANAGEMENTMASTER:30103/vdag-controllers/by-vdag-uri/llm-analyzer:0.0.3-stable | json_pp

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  1871  100  1871    0     0   3480      0 --:--:-- --:--:-- --:--:--  3484


{
   "data" : [
      {
         "cluster_id" : "gcp-cluster-2",
         "config" : {
            "api_url" : "http://CLUSTER1MASTER:32696",
            "policy_execution_mode" : "local",
            "replicas" : 1,
            "rest_url" : "http://CLUSTER1MASTER:31351",
            "rpc_url" : "CLUSTER1MASTER:30095"
         },
         "metadata" : {},
         "public_url" : "CLUSTER1MASTER:30095",
         "search_tags" : [
            "objedet",
            "narasimha",
            "prasanna"
         ],
         "vdag_controller_id" : "llm-004",
         "vdag_uri" : "llm-analyzer:0.0.3-stable"
      },
      {
         "cluster_id" : "gcp-cluster-2",
         "config" : {
            "api_url" : "http://CLUSTER1MASTER:32084",
            "policy_execution_mode" : "local",
            "replicas" : 1,
            "rest_url" : "http://CLUSTER1MASTER:30436",
            "rpc_url" : "CLUSTER1MASTER:30666"
         },
         "metadata" : {},
         "public_url" : "CLUSTER1MASTER:30666",
         "search_tags" : [
            "objedet",
            "narasimha",
            "prasanna"
         ],
         "vdag_controller_id" : "llm-vdag-controller-001",
         "vdag_uri" : "llm-analyzer:0.0.3-stable"
      },
      {
         "cluster_id" : "gcp-cluster-2",
         "config" : {
            "api_url" : "http://CLUSTER1MASTER:None",
            "policy_execution_mode" : "local",
            "replicas" : 1,
            "rest_url" : "http://CLUSTER1MASTER:None",
            "rpc_url" : "CLUSTER1MASTER:None"
         },
         "metadata" : {},
         "public_url" : "CLUSTER1MASTER:None",
         "search_tags" : [
            "objedet",
            "narasimha",
            "prasanna"
         ],
         "vdag_controller_id" : "llm-vdag-controller-001",
         "vdag_uri" : "llm-analyzer:0.0.3-stable"
      },
      {
         "cluster_id" : "gcp-cluster-2",
         "config" : {
            "api_url" : "http://CLUSTER1MASTER:32076",
            "custom_data" : {
               "qualityChecker" : {
                  "framesInterval" : 1,
                  "qualityCheckerPolicyRule" : {
                     "parameters" : {
                        "db_url" : "redis://POLICYSTORESERVER:6379/0"
                     },
                     "policyRuleURI" : "quality-checker:2.0-stable"
                  }
               },
               "quotaChecker" : {
                  "quotaCheckerPolicyRule" : {
                     "parameters" : {
                        "default_limit" : 1,
                        "whitelist" : [
                           "session10"
                        ]
                     },
                     "policyRuleURI" : "quota-checker:2.0-stable"
                  }
               }
            },
            "policy_execution_mode" : "local",
            "replicas" : 1,
            "rest_url" : "http://CLUSTER1MASTER:32638",
            "rpc_url" : "CLUSTER1MASTER:30393"
         },
         "metadata" : {},
         "public_url" : "CLUSTER1MASTER:30393",
         "search_tags" : [
            "objedet",
            "narasimha",
            "prasanna"
         ],
         "vdag_controller_id" : "policies-test-c",
         "vdag_uri" : "llm-analyzer:0.0.3-stable"
      }
   ],
   "success" : true
}
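One entry in the listing above reports its ports as None. A client that picks a controller from such a listing may want to skip entries like that. The sketch below is one way to do so; the `has_port` and `usable_controllers` helpers are hypothetical, and treating a None port as "not ready" is an assumption, since the listing itself does not say why the port is missing.

```python
def has_port(url: str) -> bool:
    """True when the text after the last ':' is a numeric port."""
    return url.rsplit(":", 1)[-1].isdigit()


def usable_controllers(controllers: list) -> list:
    """Keep only controllers whose REST and gRPC inference URLs carry a port."""
    return [
        c for c in controllers
        if has_port(c["config"]["api_url"]) and has_port(c["config"]["rpc_url"])
    ]


# Abbreviated entries from the listing above.
listing = [
    {"vdag_controller_id": "llm-004",
     "config": {"api_url": "http://CLUSTER1MASTER:32696", "rpc_url": "CLUSTER1MASTER:30095"}},
    {"vdag_controller_id": "llm-vdag-controller-001",
     "config": {"api_url": "http://CLUSTER1MASTER:None", "rpc_url": "CLUSTER1MASTER:None"}},
]

for controller in usable_controllers(listing):
    print(controller["vdag_controller_id"], controller["config"]["api_url"])
```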

Details of an individual controller can also be queried by specifying its vdag_controller_id:

%%bash
curl -X GET http://MANAGEMENTMASTER:30103/vdag-controller/aug6-controller | json_pp

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   396  100   396    0     0    756      0 --:--:-- --:--:-- --:--:--   757


{
   "data" : {
      "cluster_id" : "gcp-cluster-2",
      "config" : {
         "api_url" : "http://CLUSTER1MASTER:31096",
         "policy_execution_mode" : "local",
         "replicas" : 2,
         "rest_url" : "http://CLUSTER1MASTER:32579",
         "rpc_url" : "CLUSTER1MASTER:32693"
      },
      "metadata" : {},
      "public_url" : "CLUSTER1MASTER:32693",
      "search_tags" : [
         "vdag-llm",
         "llm-vdag"
      ],
      "vdag_controller_id" : "aug6-controller",
      "vdag_uri" : "llm-analyzer-aug6:0.0.6-stable"
   },
   "success" : true
}

Every vDAG controller exposes REST and gRPC APIs for submitting inference requests, and a REST service for managing the health checker, quality checker and quota management policies. Use config.api_url to submit inference requests over REST and config.rpc_url to submit them over gRPC.

🤖 Step 3: Run Inference on the vDAG

We now simulate an inference task using a multi-modal input — both text and image.

Input Structure:

session_id: Used for tracking and quota enforcement
seq_no: Monotonically increasing number per session
data.mode: Set to "chat" to enable conversational behavior
messages: The core input — includes both a text prompt and an image URL

What Happens:

The request is received by the controller
It routes the input to gemma3-27b-block for vision analysis
Then to llama4-scout-17b-block for event detection
Finally to magistral-small-2506-llama-cpp-block for decision making and alert triggering

The final output will be a structured scene description and alert status.
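The input structure described above can be wrapped in a small helper that keeps seq_no increasing within a session. This is a sketch under assumptions: the `InferenceSession` class is hypothetical, and the empty graph and selection_query fields simply mirror the REST example in this step.

```python
import itertools
from uuid import uuid4


class InferenceSession:
    """Builds request bodies with a fixed session_id and an increasing seq_no."""

    def __init__(self, session_id=None, start=1):
        self.session_id = session_id or str(uuid4())
        self._seq = itertools.count(start)

    def build_request(self, prompt, image_url=None, **gen_params):
        content = [{"type": "text", "text": prompt}]
        if image_url:
            content.append({"type": "image_url", "image_url": {"url": image_url}})
        return {
            "session_id": self.session_id,
            "seq_no": next(self._seq),  # increases by one on every call
            "data": {
                "mode": "chat",
                "gen_params": gen_params,
                "messages": [{"content": content}],
            },
            "graph": {},
            "selection_query": {},
        }


session = InferenceSession("session1")
first = session.build_request("Describe the scene.", temperature=0.1, max_tokens=4096)
second = session.build_request("Describe the scene.", image_url="https://example.com/frame.jpg")
print(first["seq_no"], second["seq_no"])  # 1 2
```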

Inference using REST API

Inference requests can be submitted through the REST API with the command below (the REST API URL is the config.api_url field of the vDAG controller data):

%%bash
curl -X POST  http://CLUSTER1MASTER:31096/v1/infer \
  -H "Content-Type: application/json" \
  -d '{
  "session_id": "session1",
  "seq_no": 5,
  "data": {
    "mode": "chat",
    "gen_params": {
      "temperature": 0.1,
      "top_p": 0.95,
      "max_tokens": 4096
    },
    "messages": [
      {
        "content": [
          {
            "type": "text",
            "text": "Analyze the following image and generate your objective scene report.?"
          },
          {
            "type": "image_url",
            "image_url": {
              "url": "https://akm-img-a-in.tosshub.com/indiatoday/images/story/202311/chain-snatching-caught-on-camera-in-bengaluru-293151697-16x9_0.jpg"
            }
          }
        ]
      }
    ]
  },
  "graph": {},
  "selection_query": {}
}' | json_pp

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  6239  100  5549  100   690     36      4  0:02:52  0:02:32  0:00:20  1329


{
   "data" : {
      "reply" : "### **Event Analysis**\n\n**Classification:** `Alertable`\n\n**Summary:** The event involves a person lying on the ground with a man assisting them, possibly due to an accident or incident. The presence of standing water on the road may have contributed to the situation. Further investigation is recommended to determine the exact nature of the event.\n\n---\n\n### **A. Python Policy Script (`alerter.py`)**\n\n```python\nimport logging\nimport requests\n\nclass AIOSv1PolicyRule:\n    def __init__(self, rule_id, settings, parameters):\n        self.rule_id = rule_id\n        self.settings = settings\n        self.parameters = parameters\n\n    def eval(self, parameters, input_data, context):\n        try:\n            summary = input_data.get(\"summary\", \"\")\n            if not summary:\n                return\n\n            destination_url = self.settings.BASE_URI\n            payload = {\n                \"source_rule_id\": self.rule_id,\n                \"message_content\": summary\n            }\n\n            response = requests.post(destination_url, json=payload)\n            result = [{\n                \"api_status_code\": response.status_code,\n                \"api_response_body\": response.json()\n            }]\n\n            return {\n                \"result\": result,\n                \"input_data\": input_data,\n                \"reason\": \"Success\"\n            }\n        except requests.exceptions.RequestException as e:\n            return {\n                \"result\": [],\n                \"input_data\": input_data,\n                \"reason\": f\"Request failed: {str(e)}\"\n            }\n        except Exception as e:\n            return {\n                \"result\": [],\n                \"input_data\": input_data,\n                \"reason\": f\"Error: {str(e)}\"\n            }\n```\n\n---\n\n### **B. 
JSON Registration File (`registration.json`)**\n\n```json\n{\n    \"name\": \"alerter\",\n    \"version\": \"1.0.0\",\n    \"release_tag\": \"stable\",\n    \"metadata\": {\n      \"author_name\": \"AI Analyst\",\n      \"author_email\": \"analyst@example.com\",\n      \"organization\": \"my.org\",\n      \"country\": \"US\",\n      \"license\": \"MIT\",\n      \"category\": \"Alert\",\n      \"use_case\": \"Use the policy to push the Alert to 3rd party endpoint\",\n      \"geographic_scope\": \"Global\",\n      \"audience\": [\"vdag_users\"],\n      \"integration_notes\": \"\",\n      \"tested_environments\": [],\n      \"execution_environment\": \"Python 3.10+\",\n      \"compliance_tags\": [\"ISO/IEC 27001\"]\n    },\n    \"tags\": \"alerting, webhook\",\n    \"code\": \"http://YOUR_CONTENT_STORE_IP:PORT/alerter.zip\",\n    \"code_type\": \"zip\",\n    \"type\": \"policy\",\n    \"policy_input_schema\": {\n        \"summary\": {\n            \"type\": \"string\",\n            \"description\": \"The summary message to be sent to the 3rd party service.\"\n        }\n    },\n    \"policy_output_schema\": {\n        \"result\": {\n            \"type\": \"array\",\n            \"description\": \"An array containing the status code and response body from the API call.\"\n        },\n        \"input_data\": {\n            \"type\": \"object\",\n            \"description\": \"The original input data passed to the policy.\"\n        },\n        \"reason\": {\n            \"type\": \"string\",\n            \"description\": \"Reason for the decision (e.g., 'Success' or an error message).\"\n        }\n    },\n    \"policy_settings_schema\": {\n        \"BASE_URI\": {\n            \"type\": \"string\",\n            \"description\": \"The destination URL for the HTTP POST request.\"\n        }\n    },\n    \"policy_parameters_schema\": {},\n    \"policy_settings\": {\n        \"BASE_URI\": \"http://3rdpartyserviceIP:port/receive\"\n    },\n    \"policy_parameters\": {},\n    
\"management_commands_schema\": [],\n    \"description\": \"A policy to push alert messages to a third-party service via HTTP POST.\",\n    \"functionality_data\": {},\n    \"resource_estimates\": {}\n}\n```\n\n---\n\n### **C. Bash Pusher Script (`pusher.sh`)**\n\n```bash\n#!/bin/bash\n\n# The path to the zip file you want to upload\nFILE_PATH=\"alerter.zip\"\n\n# The destination URL for the content store\nENDPOINT=\"http://YOUR_CONTENT_STORE_IP:PORT/upload\"\n\necho \"Uploading $FILE_PATH to $ENDPOINT...\"\n# Use curl to post the file\ncurl -X POST -F \"file=@$FILE_PATH\" $ENDPOINT\n```\n\n---\n\n### **D. Bash Registration Script (`register.sh`)**\n\n```bash\n#!/bin/bash\n\n# The endpoint for the policy registry\nENDPOINT=\"http://YOUR_REGISTRY_IP:PORT/policy\"\n\necho \"Registering policy using registration.json...\"\n# Use curl to post the registration file\ncurl -X POST -H \"Content-Type: application/json\" --data @./registration.json $ENDPOINT\n```\n\n---\n\n### **E. Bash Deployment Script (`deploy_job.sh`)**\n\n```bash\n#!/bin/bash\n\n# The endpoint for the job submission API\nENDPOINT=\"http://YOUR_JOB_API_IP:PORT/jobs/submit/executor-001\"\n\necho \"Submitting job for alerter policy...\"\n# Use curl to post the job submission\ncurl -X POST $ENDPOINT \\\n -H \"Content-Type: application/json\" \\\n -d '{\n    \"name\": \"alerter-critical-node-job\",\n    \"policy_rule_uri\": \"alerter:1.0.0-stable\",\n    \"policy_rule_parameters\": {},\n    \"node_selector\": {},\n    \"inputs\": {\n      \"summary\": \"Critical alert: High CPU usage detected on node-123.\"\n    }\n  }'\n```"
   },
   "seq_no" : 5,
   "session_id" : "session1",
   "ts" : 1754466151.29808
}
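The reply field is free-form Markdown generated by the last block, so any downstream automation has to parse it. As a sketch, the classification label could be pulled out with a regular expression. The `extract_classification` helper is hypothetical, and it assumes the reply keeps the `**Classification:**` line shown in the output above, which a language model does not guarantee.

```python
import re

# Matches: **Classification:** `Alertable`
CLASSIFICATION_RE = re.compile(r"\*\*Classification:\*\*\s*`([^`]+)`")


def extract_classification(reply: str):
    """Return the classification label, or None if the reply has no such line."""
    match = CLASSIFICATION_RE.search(reply)
    return match.group(1) if match else None


# Opening lines of the reply shown above.
reply = "### **Event Analysis**\n\n**Classification:** `Alertable`\n\n**Summary:** The event involves a person lying on the ground."
label = extract_classification(reply)
print(label)  # Alertable
```

Returning None instead of raising lets the caller decide how to treat replies that do not follow the expected layout.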

Inference using gRPC API

The vDAG controller provides a well-defined gRPC interface and a structured protobuf definition for submitting inference requests:

syntax = "proto3";

message vDAGFileInfo {
    string metadata = 1; 
    bytes file_data = 2; 
}

// Definition of the message structure
message vDAGInferencePacket {
    string session_id = 3;      
    uint64 seq_no = 4;           
    bytes frame_ptr = 5;       
    string data = 6;             
    double ts = 8;
    repeated vDAGFileInfo files = 9;
}

// Definition of the gRPC service
service vDAGInferenceService {
    rpc infer(vDAGInferencePacket) returns (vDAGInferencePacket);
}
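For Python clients, the definition above can be compiled with the grpcio-tools package (`pip install grpcio-tools`). The sketch below assumes the definition is saved as proto/vdag_service.proto, a path inferred from the module name `proto.vdag_service_pb2` and not confirmed by this notebook. The helper names are hypothetical.

```python
from pathlib import Path


def protoc_args(proto_file: str, include_dir: str = ".", out_dir: str = ".") -> list:
    """Build the argument list for grpc_tools.protoc."""
    return [
        "grpc_tools.protoc",             # argv[0], ignored by the compiler
        f"-I{include_dir}",
        f"--python_out={out_dir}",       # generates *_pb2.py (messages)
        f"--grpc_python_out={out_dir}",  # generates *_pb2_grpc.py (service stub)
        proto_file,
    ]


def compile_proto(proto_file: str = "proto/vdag_service.proto") -> int:
    """Compile the proto file; returns the compiler's exit code (0 on success)."""
    if not Path(proto_file).exists():
        raise FileNotFoundError(proto_file)
    from grpc_tools import protoc  # imported lazily; needs grpcio-tools installed
    return protoc.main(protoc_args(proto_file))


print(" ".join(protoc_args("proto/vdag_service.proto")))
```

Running `compile_proto()` from the directory that contains proto/ should place vdag_service_pb2.py and vdag_service_pb2_grpc.py next to the proto file.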

The proto file above can be compiled and imported in any programming language to integrate the vDAG controller into your application. Here is a sample Python file that submits an inference task and waits for the result using the gRPC API.

import grpc
import time
import logging
from pathlib import Path
from uuid import uuid4
import json
from concurrent.futures import ThreadPoolExecutor

from proto.vdag_service_pb2 import vDAGInferencePacket, vDAGFileInfo
from proto.vdag_service_pb2_grpc import vDAGInferenceServiceStub

# Configure logging
logging.basicConfig(level=logging.INFO)


def load_file_info(file_path: str, metadata: str = "") -> vDAGFileInfo:
    with open(file_path, "rb") as f:
        file_data = f.read()
    return vDAGFileInfo(metadata=metadata, file_data=file_data)


def send_inference_request(seq_no: int):
    # Replace the address with the config.rpc_url value of your own controller
    channel = grpc.insecure_channel("CLUSTER1MASTER:32409")
    stub = vDAGInferenceServiceStub(channel)

    data = {
        "mode": "chat",
        "gen_params": {
            "temperature": 0.1,
            # "min_p": 0.01,
            # "top_k": 64,
            "top_p": 0.95,
            "max_tokens": 4096  # Set a limit for the response length
        },
        "messages": [{"content": [
            {"type": "text", "text": "Analyze the following image and generate your objective scene report.?"},
            {"type": "image_url",
             "image_url": {"url": "https://akm-img-a-in.tosshub.com/indiatoday/images/story/202311/chain-snatching-caught-on-camera-in-bengaluru-293151697-16x9_0.jpg"}}]}]
    }
    ts = time.time()

    # Optional file
    example_file = Path("example.txt")
    if example_file.exists():
        files = [load_file_info(str(example_file), metadata=f"seq_{seq_no}")]
    else:
        files = []

    session_id = str(uuid4())

    request = vDAGInferencePacket(
        session_id=session_id,
        seq_no=seq_no,
        frame_ptr=b"",
        data=json.dumps(data),
        ts=ts,
        files=files
    )

    try:
        logging.info(f"[seq_no={seq_no}] Sending request")
        st = time.time()
        response = stub.infer(request)
        et = time.time()
        logging.info(
            f"[seq_no={seq_no}] Response: data={response.data}, latency={et - st:.3f}s")
    except grpc.RpcError as e:
        logging.error(
            f"[seq_no={seq_no}] gRPC error: {e.code()} - {e.details()}")


send_inference_request(10)
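The script above sends a single request. To submit several requests in parallel, a thread pool is one option. The sketch below is generic: `send_many` is a hypothetical helper that accepts any callable taking a seq_no, and `fake_send` is a stand-in so the example runs without a controller. Whether a controller accepts concurrent requests within one session is not covered by this notebook.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def send_many(send, seq_nos, max_workers=4):
    """Call send(seq_no) for each seq_no in parallel; map seq_no to result or exception."""
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(send, n): n for n in seq_nos}
        for future in as_completed(futures):
            seq_no = futures[future]
            try:
                results[seq_no] = future.result()
            except Exception as exc:  # keep going if one request fails
                results[seq_no] = exc
    return results


def fake_send(seq_no):
    """Stand-in for send_inference_request; fails on seq_no 3 to show error capture."""
    if seq_no == 3:
        raise RuntimeError("simulated failure")
    return f"reply-{seq_no}"


results = send_many(fake_send, [1, 2, 3])
print(sorted(results))  # [1, 2, 3]
```

To use it with the script above, pass `send_inference_request` as `send`. That function logs its response and returns None, so the collected results would be None unless it is changed to return `response.data`.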

🧹 Step 4: Clean-up

The controller can be removed using the following command:

%%bash
curl -X POST http://MANAGEMENTMASTER:30600/vdag-controller/gcp-cluster-2 \
  -H "Content-Type: application/json" \
  -d '{
    "action": "remove_controller",
    "payload": {
      "vdag_controller_id": "aug6-controller"
    }
  }'

If the vDAG entry is no longer needed, it can be removed using the following command:

%%bash
curl -X DELETE http://MANAGEMENTMASTER:30103/vdag/llm-analyzer-aug6:0.0.6-stable

Testing the live demo using Streamlit dashboard

We have created a Streamlit dashboard that provides live, real-time interaction through a chat interface.

pip install streamlit

Run the Streamlit demonstration:

streamlit run app_vdag_final_working.py