// hello world
WALEED
KHALIL.
AI Engineer building production LLM-integrated systems, multi-agent orchestration, and RAG pipelines. Cloud-native platforms on Azure processing 825K+ records at sub-10ms latency.

01 — About
Building AI systems
that ship to production.
AI Engineer with production experience architecting LLM-integrated systems, multi-agent orchestration, and RAG pipelines for regulated financial services environments. Built cloud-native AI platforms on Azure AKS processing 825K+ enterprise records at sub-10ms latency.
Strong Python and MLOps foundation, hands-on across the full AI stack from prompt engineering and fine-tuning through evaluation frameworks, vector databases, and production deployment. Measurable outcomes: 40% token cost reduction, 98% schema compliance, and 85%+ autonomous failure resolution.
AI & GenAI
LangChain, RAG, Multi-Agent, LLMs
Backend & APIs
Python, FastAPI, Node.js, GraphQL
Cloud & MLOps
Azure AKS, AWS, Docker, K8s
Data & Infra
DuckDB, Snowflake, PostgreSQL, Redis
825K+
Records Processed
<10ms
Inferencing Latency
40%
Token Cost Reduction
98%
Schema Compliance
02 — Experience
Where I've worked.
NxtHumans
AI Engineer
Philadelphia, PA
- Architected and deployed a production LLM-integrated AI platform on Azure AKS for a regulated insurance organization processing 825K+ records across 4 brands: 8K+ lines of Terraform, 50+ resources, 3 environments, and full MLOps observability
- Engineered a TOON serialization layer with Pydantic schema enforcement to keep structured LLM output consistent across 15 personas; achieved a 40% token cost reduction and a measurable accuracy improvement on Azure AI Foundry + Azure OpenAI
- Designed prompt engineering pipelines and AI automation workflows with stakeholder feedback loops; shipped iterative production improvements via blue-green deployments with Application Insights and full audit observability
- Built a real-time AI decision support layer (DuckDB synced from Snowflake, two-tier caching, and SQL consolidation from 16 queries into 1 CTE), achieving sub-10ms inferencing latency on live regulated financial data
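The two-tier caching mentioned in the last bullet can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the production design: the tiers are assumed to be a small in-process LRU in front of a larger shared store (a plain dict stands in for it here), and `TwoTierCache`, `loader`, and `l1_size` are hypothetical names.

```python
from collections import OrderedDict
from typing import Any, Callable, Dict, Hashable


class TwoTierCache:
    """Tier 1: small in-process LRU. Tier 2: larger shared store (a dict stands in)."""

    def __init__(self, loader: Callable[[Hashable], Any], l1_size: int = 2) -> None:
        self.loader = loader          # hypothetical slow path, e.g. a warehouse query
        self.l1: "OrderedDict[Hashable, Any]" = OrderedDict()
        self.l2: Dict[Hashable, Any] = {}  # assumption: stand-in for a shared cache
        self.l1_size = l1_size
        self.loads = 0                # number of slow-path calls made so far

    def get(self, key: Hashable) -> Any:
        # Tier 1 hit: refresh recency and return immediately.
        if key in self.l1:
            self.l1.move_to_end(key)
            return self.l1[key]
        # Tier 2 hit, or fall through to the loader and populate tier 2.
        if key in self.l2:
            value = self.l2[key]
        else:
            value = self.loader(key)
            self.loads += 1
            self.l2[key] = value
        self._promote(key, value)
        return value

    def _promote(self, key: Hashable, value: Any) -> None:
        # Insert into tier 1 and evict the least recently used entry if full.
        self.l1[key] = value
        self.l1.move_to_end(key)
        if len(self.l1) > self.l1_size:
            self.l1.popitem(last=False)
```

A key evicted from tier 1 is still served from tier 2, so the slow loader only runs once per key in this sketch.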
United Safety & Survivability Corporation
Data Engineering Intern
Exton, PA
- Developed Power BI dashboards and Power Automate ETL workflows that reduced operational reporting time by 25% across 3 departments
03 — Skills
Technologies I work with.
AI & GenAI
ML & Backend
Cloud & MLOps
Data & Infra
04 — Projects
Things I've built.
Multi-Agent Task Orchestrator
Production multi-agent system using Google ADK + LangGraph that decomposes requests into specialized sub-agents for research, code generation, and data analysis. Dynamic tool calling with automatic function routing.
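As a rough illustration of function routing for tool calls, the sketch below dispatches a structured call to a registered Python function. It deliberately uses only the standard library and does not reproduce the Google ADK or LangGraph APIs; the registry, the `tool` decorator, the `route` function, and the two example tools are all hypothetical.

```python
from typing import Any, Callable, Dict, List

# Hypothetical registry mapping tool names to plain Python callables.
TOOLS: Dict[str, Callable[..., Any]] = {}


def tool(name: str) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
    """Register a function under a tool name."""
    def register(fn: Callable[..., Any]) -> Callable[..., Any]:
        TOOLS[name] = fn
        return fn
    return register


@tool("research")
def research(query: str) -> str:
    # Placeholder for a research sub-agent.
    return f"notes on {query}"


@tool("analyze")
def analyze(values: List[float]) -> float:
    # Placeholder for a data analysis sub-agent: returns the mean.
    return sum(values) / len(values)


def route(call: Dict[str, Any]) -> Any:
    """Dispatch a call shaped like {"tool": name, "args": {...}}."""
    name = call["tool"]
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    return TOOLS[name](**call.get("args", {}))
```

In a real system the `call` dictionary would typically come from a model's structured tool-call output; here it is passed in by hand.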
AI-Powered API Testing Agent
Autonomous AI agent that ingests OpenAPI specs and applies RAG-based constraint inference to generate, execute, and refine production test suites. Drove coverage from ~20% to 80%+.
Autonomous Data Pipeline Agent
Self-healing ETL agent with MLOps-style decision loop: schema comparison, data profiling, and SQL patch generation against live PostgreSQL — resolved 85%+ of pipeline failures autonomously.
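The schema comparison and SQL patch generation step might be sketched like this. It is a simplified illustration, not the agent's actual logic: schemas are assumed to be plain `{column: type}` dictionaries, the function name `schema_patch` is hypothetical, and identifiers are interpolated unquoted, so inputs are assumed to be trusted.

```python
from typing import Dict, List


def schema_patch(table: str, expected: Dict[str, str], actual: Dict[str, str]) -> List[str]:
    """Return PostgreSQL ALTER statements that move `actual` toward `expected`.

    Assumption: only missing columns and type mismatches are handled;
    extra columns in `actual` are left alone.
    """
    statements: List[str] = []
    for column, col_type in expected.items():
        if column not in actual:
            # Column is missing in the live table: add it.
            statements.append(f"ALTER TABLE {table} ADD COLUMN {column} {col_type};")
        elif actual[column].lower() != col_type.lower():
            # Column exists with a different type: change the type.
            statements.append(
                f"ALTER TABLE {table} ALTER COLUMN {column} TYPE {col_type};"
            )
    return statements
```

A production agent would also need to review generated statements before applying them, since a type change can fail or lose data when existing values do not cast cleanly.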
05 — Services
What I can build for you.
LLM-Integrated Systems
Production AI platforms with LLM integration, prompt engineering pipelines, structured output enforcement, and full MLOps observability.
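Structured output enforcement can be illustrated with a small validator that rejects any model response that does not match an expected shape. This is a standard-library sketch only; the field names in `SCHEMA` and the function `parse_llm_output` are hypothetical, and a real system would more likely use a schema library such as Pydantic.

```python
import json
from typing import Any, Dict

# Hypothetical schema: field name -> required Python type.
SCHEMA: Dict[str, type] = {"persona": str, "summary": str, "confidence": float}


def parse_llm_output(raw: str, schema: Dict[str, type] = SCHEMA) -> Dict[str, Any]:
    """Parse a raw model response and raise ValueError if it breaks the schema."""
    data = json.loads(raw)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    missing = schema.keys() - data.keys()
    extra = data.keys() - schema.keys()
    if missing or extra:
        raise ValueError(f"missing={sorted(missing)} extra={sorted(extra)}")
    for field, field_type in schema.items():
        if not isinstance(data[field], field_type):
            raise ValueError(f"{field}: expected {field_type.__name__}")
    return data
```

A caller can catch `ValueError` and re-prompt the model, which is one common way to push compliance rates up.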
Multi-Agent Orchestration
Designing and building autonomous agent systems with dynamic tool calling, inter-agent communication, and agentic exploration loops.
RAG & Data Pipelines
RAG pipelines with vector databases, ETL automation, real-time inferencing layers, and self-healing data pipeline agents.
Cloud & MLOps
Cloud-native deployments on Azure/AWS with Kubernetes, Terraform IaC, CI/CD pipelines, and production-grade monitoring.
06 — Education
Background & credentials.
B.S. Computer Science
West Chester University of Pennsylvania
Relevant Coursework
Certifications
Cloud & AI Professional Certifications
07 — Contact
Let's work together.
I'm always open to discussing new projects, creative ideas, or opportunities to be part of something great. Feel free to reach out.