Vasu Menon

AI-Powered Project Tracking for Hitachi Energy

This project was completed as my senior capstone at NC State University in partnership with Hitachi Energy, as part of the NCSU CSC Senior Design Program (Spring 2026).


Overview

Hitachi Energy’s teams managing the traction transformer program track project data across a fragmented landscape of spreadsheets, SharePoint folders, and other disparate sources. There was no automated way to consolidate this data or to surface risks, schedule slippages, and resource conflicts to the right people promptly.

The goal of this project was to build an end-to-end BI pipeline that ingests project tracking data from across these sources daily, runs AI analysis over the combined dataset, and delivers persona-tailored reports and email digests to different stakeholder groups, all without any manual intervention.

The system runs as an Azure Container App Job triggered on a cron schedule. Each run syncs fresh data from SharePoint, generates insights using a GPT model, builds interactive HTML reports, and dispatches email digests to configured recipients.


Pipeline

flowchart TD
    subgraph ingest["Ingestion"]
        SP["SharePoint / OneDrive"]
        B1[("bi-data")]
    end
    subgraph ai["AI Analysis"]
        IA["Insight Agent\ngpt-5.4-mini"]
        TL[/"filter · stats · cross_ref · search"/]
        CP["Chart Planner\ngpt-5.4-mini"]
    end
    subgraph deliver["Delivery"]
        RPT["HTML Report · Plotly · Matplotlib"]
        B2[("bi-reports")]
        EM["Email Dispatch\nAzure Comms"]
    end

    SP -->|"Graph API"| B1
    B1 --> IA
    IA <-->|"tool calls"| TL
    IA --> CP
    CP --> RPT
    RPT --> B2
    B2 -.->|"report link"| EM
    IA --> EM
    EM --> PM(["Project Managers"])
    EM --> EN(["Engineers"])
    EM --> RD(["R&D"])
    EM --> OP(["Operations"])

Each scheduled run executes the following steps:

  1. SharePoint sync: the pipeline authenticates against the Microsoft Graph API and downloads all CSV and Excel files from a configured OneDrive/SharePoint folder into Azure Blob Storage.

  2. Data ingestion: CSV files are downloaded from the bi-data blob container and loaded into Pandas DataFrames. The pipeline raises an error early if no data is found.

  3. Insight generation: a LangChain agent backed by Azure AI Foundry (GPT-5.4-mini) is given a detailed system prompt and a suite of data tools. It calls tools like filter_rows, calculate_column_statistics, cross_reference, and search_text to explore the data before synthesizing prioritized, evidence-backed insights. The agent is retried automatically on Azure OpenAI 429 rate-limit errors, respecting the Retry-After header.

  4. Chart planning: a second agent call produces a structured chart plan describing which visualizations best support the insights for each persona.

  5. HTML report generation: the chart plan is executed to build interactive Plotly charts and Matplotlib figures, which are assembled into a self-contained HTML report and uploaded to the bi-reports blob container.

  6. Email dispatch: Azure Communication Services sends each persona a styled HTML email containing the AI-generated insights and a link to the HTML report.
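The retry behavior described in step 3 can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: RateLimited is a hypothetical stand-in for whatever 429 error the model client raises, and call_with_retry, max_attempts, and base_delay are names invented for the example. The exponential backoff used when no Retry-After value is present is also an assumption.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Hypothetical stand-in for a 429 error raised by the model client."""

    def __init__(self, retry_after: Optional[float] = None):
        super().__init__("429 Too Many Requests")
        # Seconds taken from the Retry-After header, or None if it was absent.
        self.retry_after = retry_after


def call_with_retry(
    fn: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying on RateLimited for up to max_attempts calls in total."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except RateLimited as err:
            if attempt == max_attempts:
                raise  # out of attempts: surface the error to the caller
            if err.retry_after is not None:
                delay = err.retry_after  # wait as long as the server asked
            else:
                # Assumed fallback: exponential backoff (2s, 4s, 8s, ...).
                delay = base_delay * 2 ** (attempt - 1)
            sleep(delay)
    raise RuntimeError("max_attempts must be at least 1")
```

Passing sleep as a parameter keeps the wait injectable, so a unit test can record the requested delays instead of actually pausing.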

The pipeline is persona-driven: each of the four stakeholder groups (Project Managers, Engineers, R&D, and Operations) gets its own system prompt, its own chart plan, and its own recipient list.
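One way to picture the persona-driven design is as a small configuration table that the pipeline loops over. The sketch below is illustrative only: the Persona and Digest classes, the prompt text, the example.com addresses, and the build_digests function are all hypothetical, and chart plans are left out for brevity. The generate_insights argument stands in for the agent call.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass(frozen=True)
class Persona:
    name: str
    system_prompt: str
    recipients: Tuple[str, ...]


@dataclass(frozen=True)
class Digest:
    recipients: Tuple[str, ...]
    subject: str
    body: str


# Hypothetical prompts and addresses, for illustration only.
PERSONAS = [
    Persona("Project Managers", "Focus on schedule slippage and milestone risk.",
            ("pm-team@example.com",)),
    Persona("Engineers", "Focus on technical blockers and open design issues.",
            ("engineering@example.com",)),
    Persona("R&D", "Focus on prototype and test-campaign status.",
            ("rnd@example.com",)),
    Persona("Operations", "Focus on resource conflicts and capacity.",
            ("operations@example.com",)),
]


def build_digests(
    personas: Sequence[Persona],
    generate_insights: Callable[[str], str],
) -> List[Digest]:
    """Run the insight step once per persona and pair the result with its recipients."""
    digests = []
    for persona in personas:
        body = generate_insights(persona.system_prompt)
        digests.append(
            Digest(persona.recipients, f"Daily insights: {persona.name}", body)
        )
    return digests
```

Keeping the per-persona differences in data rather than in branching code means a fifth stakeholder group would only need one more entry in the list.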


Infrastructure

All cloud resources are provisioned with Azure Bicep IaC templates and deployed via a two-job GitHub Actions workflow:

Resource                          Purpose
Azure Storage Account (HNS)       bi-data (input files) and bi-reports (HTML output) containers
Azure AI Foundry                  GPT-5.4-mini deployment for insight and chart-plan generation
Azure Key Vault                   Stores all secrets; never in plaintext in source control
Azure Communication Services      Sends insight emails from a managed Azure domain
Azure Container Registry          Hosts the Docker image
Container App Job                 Runs the pipeline on a cron schedule
User-Assigned Managed Identity    Grants the job least-privilege access to ACR and Key Vault

The Container App Job reads secrets at deploy time via Key Vault secret references, so no credentials are passed as environment variables at runtime.


Testing

The test suite covers three layers:

  • Unit tests: pure-function tests for data utilities, logging, email templating, chart generation, and tool logic.
  • Integration tests: tests against live Azure services (Storage, AI, Communication Services) gated behind environment variable checks, so they are skipped in CI if credentials are absent.
  • End-to-end test: runs the full pipeline against a small fixture dataset and asserts that an email was sent and a report was uploaded.
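The environment-variable gating used for the integration tests can be sketched as below. The variable names, the missing_env helper, and the test class are hypothetical. To keep the sketch dependency-free it uses the standard-library unittest skip decorator, which pytest also honors when it collects the class; pytest.mark.skipif is the native pytest equivalent.

```python
import os
import unittest
from typing import Iterable, List, Mapping, Optional

# Hypothetical variable names; the real suite may check different ones.
REQUIRED_ENV = ("AZURE_STORAGE_ACCOUNT_URL", "AZURE_OPENAI_ENDPOINT")


def missing_env(
    names: Iterable[str],
    env: Optional[Mapping[str, str]] = None,
) -> List[str]:
    """Return the names that are unset or empty in env (default: os.environ)."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]


_missing = missing_env(REQUIRED_ENV)


@unittest.skipIf(_missing, f"missing credentials: {', '.join(_missing)}")
class StorageIntegrationTest(unittest.TestCase):
    def test_account_url_is_https(self):
        # Placeholder check; a real test would call the live service here.
        url = os.environ["AZURE_STORAGE_ACCOUNT_URL"]
        self.assertTrue(url.startswith("https://"))
```

Because the check runs when the module is imported, a CI run without credentials reports these tests as skipped rather than failed.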

Coverage is tracked via pytest-cov and the badge is automatically updated in CI.


Stack