Yu-Cheng Tsai
Principal ML Scientist · 8× Patent Inventor · Agentic AI Researcher

Engineering the Future of Agentic AI & Large Language Models

Princeton-trained AI Scientist building production-grade Generative AI systems for finance and accounting. Inventor on 8 US patents and applications spanning LLM architectures, GNNs, and computer vision. Author of 10+ technical articles on agentic AI and LLM fine-tuning.

Building AI at the Frontier

I'm a Principal Machine Learning Scientist at Sage, where I lead the architecture and deployment of Generative AI applications for finance and accounting. With a Ph.D. from Princeton University and a B.S. from National Taiwan University, I bring rigorous scientific discipline to practical AI systems that matter.

My work centers on Agentic AI — building systems that can reason, plan, and act autonomously. I design multi-agent orchestration pipelines using MCP, Pydantic AI, and LangChain, run distributed LLM fine-tuning with DeepSpeed & Ray on AWS, and ship production ML systems at scale.

I am an inventor on 8 US Patents and Applications spanning Generative AI, LLM architectures, Graph Neural Networks, and Computer Vision — and I actively contribute to the AI community through 10+ technical publications reaching thousands of practitioners globally.

Beyond technical work, I apply AI for social good: using Generative AI to advance math education and create impactful parenting resources for low-resource families.

8 · US Patents & Applications
10+ · Technical Publications
Ph.D. · Princeton University, 2013
4+ · Years at Sage AI

Patents & Inventions

Inventor on 8 US Patents and Applications spanning Generative AI, LLM architectures, Graph Neural Networks, Computer Vision, and MLOps.

👁️
US10769198 Granted

Systems and Methods for Product Identification Using Image Analysis from Image Mask and Trained Neural Network

Computer vision framework applying image masking and trained neural networks for accurate product identification and segmentation — enabling scalable visual understanding in multi-modal AI pipelines.

Computer Vision · Image Segmentation · Neural Networks · Deep Learning
💬
App 20240394285 Application

Chatbot

Iterative LLM architectures for handling complex, multi-step queries by dynamically leveraging external APIs and databases. Advances conversational AI beyond single-turn responses into persistent, reasoning-capable agents.

LLM · Agentic AI · Conversational AI · Multi-step Reasoning
✍️
App 20240394481 Application

Prompt Generation

Novel methods for automated, context-aware prompt generation for large language models — improving response quality, task alignment, and downstream performance in enterprise AI applications.

Prompt Engineering · LLM · Generative AI
🛡️
App 20240394512 Application

Hallucination Detection

Systematic framework for detecting and mitigating hallucinations in LLM outputs — a critical reliability layer for deploying trustworthy generative AI in high-stakes financial and enterprise environments.

LLM Reliability · Hallucination · Generative AI · FinTech
🕸️
App 20250209301 Application

Generating Graph Model

GNN-based framework for constructing and training graph models over complex entity relationships — powering fraud detection, anomaly surfacing, and relational reasoning at enterprise scale.

Graph Neural Networks · GNN · Fraud Detection · FinTech
📊
App 20250209385 Application

Covariate Drift Detection

Methods for detecting covariate drift in production ML systems — enabling continuous monitoring of input distribution shifts to maintain model accuracy and reliability in live financial AI deployments.

MLOps · Model Monitoring · Distribution Shift · Reliability
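
To make the idea of monitoring input distribution shifts concrete, here is a minimal sketch of one common, generic check: a two-sample Kolmogorov-Smirnov statistic that compares a feature's training-time values with recent production values. It is an illustration only and is not taken from the patent application; the function names, the sample invoice amounts, and the 0.2 threshold are all hypothetical.

```python
from bisect import bisect_right
from typing import Sequence


def ks_statistic(reference: Sequence[float], live: Sequence[float]) -> float:
    """Two-sample Kolmogorov-Smirnov statistic: the largest gap between
    the empirical CDFs of the two samples (0 = identical, 1 = disjoint)."""
    if not reference or not live:
        raise ValueError("both samples must be non-empty")
    ref_sorted = sorted(reference)
    live_sorted = sorted(live)
    max_gap = 0.0
    # The gap between two step functions can only peak at an observed value.
    for x in ref_sorted + live_sorted:
        ref_cdf = bisect_right(ref_sorted, x) / len(ref_sorted)
        live_cdf = bisect_right(live_sorted, x) / len(live_sorted)
        max_gap = max(max_gap, abs(ref_cdf - live_cdf))
    return max_gap


def has_drifted(reference: Sequence[float], live: Sequence[float],
                threshold: float = 0.2) -> bool:
    """Flag drift when the KS statistic exceeds a hypothetical threshold."""
    return ks_statistic(reference, live) > threshold


# Hypothetical feature values: invoice amounts at training time vs. in production.
training_amounts = [100.0, 120.0, 130.0, 150.0, 170.0, 200.0]
similar_amounts = [105.0, 125.0, 140.0, 155.0, 180.0, 195.0]
shifted_amounts = [400.0, 420.0, 450.0, 480.0, 500.0, 520.0]

print(has_drifted(training_amounts, similar_amounts))
print(has_drifted(training_amounts, shifted_amounts))
```

In practice the threshold would be tuned per feature, or replaced by a p-value from a statistics library; the fixed cutoff here only keeps the sketch dependency-free.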
📈
App 20260064736 Application

Data Resource Identification and Metric Calculation

Automated methods for identifying relevant data resources and computing derived metrics — providing intelligent data discovery and quantitative insight generation for AI-driven financial analytics.

Data Intelligence · Analytics · AI · FinTech
🖥️
App 20210350391 Application

Methods and Systems for Providing a Personalized User Interface

Adaptive UI personalization system that dynamically tailors interface elements to individual user behavior and preferences — improving engagement and usability in enterprise software products.

Personalization · UX · ML · Adaptive Systems

Career Journey

Principal Machine Learning Scientist

Sage · Remote | 2021 – Present
  • Spearheaded architecture & development of numerous Generative AI applications for finance and accounting, leveraging MCP, LangChain, Streamlit, and vector embedding databases
  • Led distributed parallel training and fine-tuning of LLM foundation models on AWS utilizing Ray, DeepSpeed, and HuggingFace for high-performance domain adaptation
  • Engineered and productionized an advanced financial document classification system using TensorFlow, Scikit-Learn, and Docker — resulting in a patented system architecture
  • Inventor on 8 US Patents and Applications spanning GenAI, LLM architectures, GNNs, and computer vision
MCP · LangChain · DeepSpeed · Ray · AWS · LLMs · GNNs

Data Scientist

CaaStle · Mountain View, CA | 2018 – 2021
  • Built visual and text recommender systems using image segmentation neural networks
  • Implemented wide-and-deep learning and BERT models for fashion product search and discovery
BERT · NLP · Recommender Systems · Image Segmentation

Senior Research Project Leader

ASML · Santa Clara, CA | 2016 – 2018
  • Developed CNNs for semiconductor mask design, directly enabling production of state-of-the-art CPUs, GPUs, and TPUs
  • Applied deep learning to computational lithography — improving optical proximity correction accuracy at chip-manufacturing scale
CNNs · Computer Vision · PyTorch · Semiconductor AI

Ph.D., Mechanical & Aerospace Engineering

Princeton University · 2013

Specialized in computational physics and numerical simulation. Developed deep expertise in high-performance computing and mathematical modeling — a rigorous foundation that now informs how I approach large-scale distributed ML systems.

Computational Physics · HPC · Numerical Methods

B.S., Mechanical Engineering

National Taiwan University · 2007

Publications & Insights

10+ technical articles on Medium (Sage AI, Data Science Collective) and Towards Data Science, focusing on LLM fine-tuning, agentic workflows, and scaling laws.

🌍

Tech for Good — Non-Profit Hackathons

Multi-disciplinary hackathons organized by Sage Foundation, applying agentic AI to real-world challenges for underserved communities.

2025 Hackathon · Partner: Parenting for Lifelong Health

Rethinking Parenting Lessons for Low-Resource Families

Problem: ParentText — a WhatsApp/SMS parenting course reaching families in South Africa, Mexico & Malaysia — saw a 50% completion drop when delivered without human coaches. Static comic strips caused significant drop-off.

What was built
  • Dynamic, personalized micro-stories (<200 words) tailored to each caregiver's location, relationship, and child's profile using GPT-4o
  • LLM role-play simulations with Microsoft AutoGen — characters engage in organic dialogues around stress management and emotional connection
  • Text-to-speech narration for literacy accessibility
  • Culturally attuned video content generated with Google Veo3 (replacing Azure Sora for better cultural relevance)
  • Gamified SMS challenges and story-based cliffhangers for engagement
GPT-4o · Microsoft AutoGen · Google Veo3 · Agentic AI · TTS

2024 Hackathon · Partner: STACK Assessment (Moodle)

Advancing Math Education Through AI and Enhanced UX

Problem: STACK, the world's leading open-source math assessment system, lacked personalized learning paths. Educators spent excessive time compiling data; students had no adaptive progression.

What was built
  • Personalized quiz recommendation engine — ML model trained on student exam attempts predicts scores and adapts learning plans per student
  • Teacher dashboard with actionable insights, quiz analytics, and one-click Moodle integration
  • Interactive Figma prototype demonstrating the full educator and student journey
  • Proto-personas and empathy maps to center vulnerable learners in design decisions
ML Recommendations · Moodle/STACK · Figma · Adaptive Learning

Courses & Mentorship

Sharing expertise in Generative AI with the next generation of developers and researchers.

Instructor · Teens in AI

Generative AI: Build a Custom Note App

A hands-on course teaching students how to harness Generative AI to build real-world applications. Covers core GenAI concepts, prompt engineering, and practical implementation — empowering the next generation of AI practitioners through project-based learning.

Generative AI · Prompt Engineering · Project-Based Learning · LLMs
★★★★★ 5.0

Technical Expertise

🤖 Agentic AI & LLMs

Large Language Models · Agentic Frameworks · Model Context Protocol (MCP) · LangChain · Instruction Fine-Tuning · RAG Systems · Multi-Agent Orchestration · Vector Databases · LLM Evaluation · Prompt Engineering

Distributed Training

DeepSpeed · Ray / Ray Train · HuggingFace Transformers · AWS (SageMaker, EC2) · Distributed Data Parallelism · Mixed Precision Training · Gradient Checkpointing

🛠 ML Frameworks & Tools

PyTorch · TensorFlow · Scikit-Learn · Streamlit · Docker · Apache Spark · Hadoop · PyTorch Geometric

🔬 Research Domains

Graph Neural Networks · Natural Language Processing · Fraud Detection · Computer Vision · Recommender Systems · Time-Series Forecasting · Multi-modal AI · Computational Physics

Let's Connect

Open to conversations about Agentic AI research, LLM systems, and impactful AI applications. Reach out on any platform below.