Senior Data Engineer → AI Platform Engineer
Databricks Lakehouse · Azure · LLM Pipelines · Agentic RAG · Banking & Cybersecurity
I'm a Senior Data Engineer (VP-level) with 10+ years building production-grade data platforms across banking, cybersecurity, risk technology, and healthcare, and I'm now extending that foundation into AI platform engineering.
Most AI systems fail because the data layer is broken. I build both.
I've led enterprise Lakehouse programs at Tier-1 financial institutions, processing 30M+ records daily across full Bronze/Silver/Gold medallion architectures with real security tooling data. I design the metadata-driven frameworks, governance layers, and CI/CD foundations that teams actually adopt, not just POCs that get shelved.
I'm now building LLM pipelines, Agentic RAG systems, and AI-powered DataOps tools on top of the same governed data infrastructure I've spent a decade perfecting, bridging the gap between enterprise data platforms and the AI workloads companies are rushing to deploy.
ruby = {
    "shipped": ["Agentic RAG Knowledge Assistant", "M&A Oracle (team capstone)"],
    "currently_building": ["CyberLens", "QueryForge", "PipelineGuardian"],
    "production_stack": ["Databricks", "Delta Lake", "PySpark", "Azure", "Unity Catalog"],
    "ai_stack": ["Claude API", "LangGraph", "RAG", "MLflow", "ChromaDB", "RAGAS"],
    "domain_expertise": ["Banking", "Cybersecurity", "Risk Technology", "Healthcare", "M&A / PE"],
    "open_to": ["Senior Data Engineer", "AI Platform Engineer", "ML Platform Engineer"],
}

Real systems. Real code. Running now.
| Project | What It Does | Key Stack | Status |
|---|---|---|---|
| Claude Cert Knowledge Assistant | Domain-specific Agentic RAG system with a 3-tier adaptive execution model: Tier 0 (direct LLM), Tier 1 (single-source retrieval), Tier 2 (multi-hop decomposition across sources). Implements Corrective RAG with query rewriting, Self-RAG hallucination grading, semantic caching, and hierarchical parent-child chunking. RAGAS evaluation suite included. | LangGraph · LangChain · ChromaDB · OpenAI API · RAGAS · Brave Search · Pydantic | Live |
| M&A Oracle (team capstone) | [Shipping Apr 20, 2026] Enterprise RAG system for private equity due diligence that surfaces contradictions between management earnings calls and SEC filing footnotes. Integrates 7+ data sources (SEC EDGAR 10-K/10-Q, USPTO patents, earnings transcripts) via a Knowledge Graph, RAG Router, and multi-step agentic reasoning. Enterprise-grade observability, audit trails, and Slack notifications. | Knowledge Graph · RAG Router · SEC EDGAR · USPTO API · LangGraph · Agentic AI · Multimodal RAG | Shipping Apr 20, 2026 |
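
The 3-tier adaptive execution model described for the Knowledge Assistant can be pictured as a small routing step that picks the cheapest tier able to answer a question. The sketch below is illustrative only and is not the project's code: `QueryProfile`, its two fields, and `choose_tier` are hypothetical names, and the way the two signals would be produced (for example by an LLM classifier) is assumed rather than shown.

```python
from dataclasses import dataclass
from enum import IntEnum


class Tier(IntEnum):
    DIRECT_LLM = 0     # Tier 0: answer from the model alone
    SINGLE_SOURCE = 1  # Tier 1: one retrieval pass against one source
    MULTI_HOP = 2      # Tier 2: decompose the question across sources


@dataclass
class QueryProfile:
    # Both fields are hypothetical signals, assumed to come from an
    # upstream classification step that is not part of this sketch.
    needs_domain_facts: bool
    sources_required: int


def choose_tier(profile: QueryProfile) -> Tier:
    """Return the lowest tier that can plausibly answer the question."""
    if not profile.needs_domain_facts:
        return Tier.DIRECT_LLM
    if profile.sources_required <= 1:
        return Tier.SINGLE_SOURCE
    return Tier.MULTI_HOP


if __name__ == "__main__":
    profile = QueryProfile(needs_domain_facts=True, sources_required=3)
    print(choose_tier(profile).name)
```

Routing to the lowest sufficient tier keeps simple questions off the more expensive multi-hop path, which is the usual motivation for an adaptive design like this.
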
Bridging 10 years of enterprise Data Engineering into the AI platform layer. Each project applies production-grade patterns from real regulated environments: banking, cybersecurity, and healthcare.
| Project | What It Does | Key Stack | Status |
|---|---|---|---|
| CyberLens | Security Data Lakehouse + Agentic RAG platform. Ingests multi-source security telemetry into a Bronze/Silver/Gold medallion architecture, then layers a LangGraph agent that answers natural language questions about enterprise security posture by reasoning over both structured Delta data and unstructured threat intel. | Databricks · Delta Lake · LangGraph · Claude API · ChromaDB · FastAPI · MLflow · Docker | Building |
| QueryForge | Production-grade Text-to-SQL LLMOps platform: the open-source version of what Databricks Genie and Snowflake Cortex Analyst are commercializing. Prompt versioning in MLflow, RAGAS evaluation CI/CD that fails builds on accuracy drops, SQL validation layer, and a user feedback loop for continuous improvement. | Claude API · MLflow · RAGAS · FastAPI · PostgreSQL · GitHub Actions · Streamlit · Databricks | Building |
| PipelineGuardian | Agentic DataOps monitor that detects pipeline anomalies (schema drift, volume drops, SLA breaches), gathers lineage context, generates LLM-powered root cause analysis via a 4-node LangGraph workflow, and auto-creates incident tickets with AI-written descriptions, turning hours of manual triage into 60-second automated resolution. | LangGraph · Claude API · Delta Lake · Docker · FastAPI · MLflow · Streamlit · Databricks | Building |
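
The three anomaly types PipelineGuardian watches for (schema drift, volume drops, SLA breaches) can be illustrated with plain comparisons between a baseline run and the current run. This is a minimal sketch under stated assumptions, not the project's implementation: `RunStats`, `detect_anomalies`, and the default thresholds (`max_volume_drop=0.5`, `sla_minutes=60.0`) are hypothetical, and the lineage, root cause, and ticketing steps are left out.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class RunStats:
    columns: frozenset       # column names seen in the run's output
    row_count: int           # rows written by the run
    runtime_minutes: float   # wall-clock duration of the run


def detect_anomalies(
    baseline: RunStats,
    current: RunStats,
    max_volume_drop: float = 0.5,  # hypothetical threshold
    sla_minutes: float = 60.0,     # hypothetical SLA
) -> list[str]:
    """Compare the current run to a baseline and list any findings."""
    findings = []

    # Schema drift: any column added or removed relative to the baseline.
    if current.columns != baseline.columns:
        added = sorted(current.columns - baseline.columns)
        removed = sorted(baseline.columns - current.columns)
        findings.append(f"schema_drift: added={added} removed={removed}")

    # Volume drop: fractional decrease in row count beyond the threshold.
    if baseline.row_count > 0:
        drop = 1 - current.row_count / baseline.row_count
        if drop > max_volume_drop:
            findings.append(f"volume_drop: {drop:.0%} fewer rows")

    # SLA breach: the run took longer than the agreed limit.
    if current.runtime_minutes > sla_minutes:
        findings.append(f"sla_breach: {current.runtime_minutes:.0f} min")

    return findings


if __name__ == "__main__":
    baseline = RunStats(frozenset({"id", "ts"}), 1000, 30.0)
    current = RunStats(frozenset({"id"}), 400, 75.0)
    for finding in detect_anomalies(baseline, current):
        print(finding)
```

In a fuller workflow, findings like these would be the input to the later steps (lineage lookup, root cause analysis, ticket creation) rather than the end result.
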
Data Platforms & Engineering
AI & LLM Engineering
Languages
Cloud & DevOps
Databases
| Microsoft | Databricks | Anthropic |
|---|---|---|
| Completed: Azure Data Engineer Associate (DP-203) · Azure Data Scientist Associate (DP-100) · Fabric Analytics Engineer (DP-600) · Power BI Data Analyst (PL-300) · Azure AI Fundamentals (AI-900) · Azure Data Fundamentals (DP-900) · Azure Fundamentals (AZ-900) | Completed: Lakehouse Fundamentals · Building Single-Agent Apps on Databricks · GenAI App Deployment & Monitoring · Building Retrieval Agents on Databricks. In progress: Data Engineer Professional (May 2026) | Completed: Claude Code 101. In progress: Claude Code Architect Foundations (Apr 2026) · Building with the Claude API · Introduction to Agent Skills |
This is a live job search sprint. I'm building in public.
| Focus | Details |
|---|---|
| Active Build | CyberLens · QueryForge · PipelineGuardian (all three in parallel) |
| Active Learning | Databricks Mosaic AI · Azure AI Foundry · RAG Architect Boot Camp (datasenseai.com) |
| Certs in Progress | Databricks Data Engineer Professional · Claude Code Architect Foundations |
| Available | Immediately · New Jersey / Remote / Hybrid |
| Looking for | Senior DE · AI Platform Engineer · ML Platform Engineer |
- Databricks Mosaic AI: Vector Search, Model Serving, AI Gateway, Genie patterns
- LLMOps: Prompt versioning, RAGAS evaluation, agent observability, cost tracking
- Azure AI Foundry: Prompt Flow, Azure AI Search, multi-provider LLM abstraction
- Agentic System Design: Multi-agent orchestration, tool use, context engineering, evaluation harnesses
- RAG Architect Boot Camp: 8-week enterprise RAG solution builder (datasenseai.com)
I'm actively exploring Senior Data Engineering and AI Platform Engineering roles where I can bridge enterprise data platforms with production AI workloads.
- Location: New Jersey, open to remote & hybrid roles across the US
- Email: roopmathi.gj@gmail.com
- LinkedIn: linkedin.com/in/roopmathi
- Work Auth: Canadian citizen · TN visa · processes at US border in 1 day · no USCIS wait · no lottery
"Most AI systems fail because of the data layer. I build both."