Open to Software Engineering roles

Full-Stack, Cloud & AI Engineer

I build intelligent, scalable, user-centric systems, from multi-agent AI pipelines to cloud-native infrastructure.

Sai Vivekanand Reddy
01 — About

About Me

I am a Computer Science graduate student at Northeastern University with 3 years of professional experience as a software engineer. I specialize in connecting high-performance backend systems, scalable cloud infrastructure, and generative AI.

Technical Skills

Languages

Python, Java, Go, SQL

AI / ML

PyTorch, LangChain, LangGraph, RAG

Cloud & DevOps

AWS, GCP, Docker, Kubernetes, Terraform

Frameworks & Data

React, FastAPI, Spring Boot, PostgreSQL, Airflow
02 — Experience

Work Experience

AI Engineer
Humanitarians AI
Jan 2025 – May 2025
  • Built multi-agent AI infrastructure in Python/FastAPI with autonomous task routing
  • Designed 4-tier memory (ChromaDB + OpenSearch) for scalable retrieval
  • Shipped a multimodal document pipeline with Gemini Vision, achieving high OCR and table-extraction accuracy
Member of Technical Staff
Puddl
Jun 2022 – Aug 2023
  • Developed and maintained core platform features using modern web technologies
  • Collaborated with cross-functional teams to deliver scalable solutions
  • Implemented robust testing and deployment practices
Software Engineer
Blue Yonder
Jul 2020 – May 2022
  • Optimized Demand Workbench data loads (~40% faster initial loads)
  • Refactored Save & Calc into staged operations enabling rapid iteration
  • Resolved complex full-stack defects while maintaining strong test coverage and secure communications
03 — Projects

Projects

Distributed Systems

Kafka from Scratch

  • Kafka broker built in Go — TCP server, wire-protocol framing, ApiVersions & Fetch APIs
  • Concurrent connection handling with goroutines and length-prefixed binary parsing
Go, TCP, Concurrency
View on GitHub →
HPC / ML

Multi-GPU Distributed Training

  • CPU (Joblib/Dask) + multi-GPU pipelines; significant preprocessing speedups
  • Mixed-precision + PyTorch DDP for scalable training
PyTorch, Dask, CUDA
View on GitHub →
Cloud / DevOps

Scalable Cloud-Native App (GCP)

  • Spring Boot + MySQL; auth & email verification; auto-scales on GCP MIGs
  • Terraform IaC; GitHub Actions + Packer for rolling deployments
Spring Boot, Terraform, GCP
View on GitHub →
AI / Data

Multimodal RAG for CFA Publications

  • Retrieval over text, tables & images from financial publications
  • Airflow ingestion, Milvus vectors, Snowflake, FastAPI + Streamlit on AWS
Milvus, Snowflake, Airflow
View on GitHub →
AI / Agents

Multi-Agent RAG System

  • LangGraph agents orchestrate web and academic retrieval to synthesize insights
  • Airflow + Docling + Pinecone pipeline; FastAPI backend and UI
LangGraph, Airflow, Pinecone
View on GitHub →
AI / ML

Text2SQL Model Distillation

  • Distilled a 0.6B Qwen3 model for natural-language → SQL on CSV data
  • 2× improvement over the base model: 74% by LLM-as-judge evaluation at 1/1000th the teacher's size
Python, Qwen3, Ollama
View on GitHub →
AI / Full-Stack

ChatDoc — PDF Analysis

  • Upload PDFs, extract text, summarize, and ask questions with Mistral AI
  • FastAPI backend with a Streamlit conversational interface
Mistral AI, FastAPI, Streamlit
View on GitHub →
Backend / Media

YouTube Lite

  • Video platform supporting upload, processing, and playback
  • Backend services for media ingestion and transcoding workflows
Backend, Media, Streaming
View on GitHub →