PhD Thesis Defense
We develop a unified framework for solving utility maximization problems (UMPs) in continuous-time stochastic control under high-dimensional, partially observed, and path-dependent uncertainty. Such settings arise in quantitative and engineering control problems where classical dynamic programming breaks down due to the curse of dimensionality and non-Markovian structure. Our approach combines forward--backward stochastic differential equations (FBSDEs) with deep learning to construct scalable, simulation-based algorithms that operate directly on trajectories. A central principle is the interplay between primal and dual formulations of stochastic control. We show that the duality gap, traditionally a post-hoc diagnostic, can be elevated to a principled driver of learning, yielding stable training and a computable measure of near-optimality. In regime-switching models, dynamic programming, convex duality, and FBSDE representations together yield a complete analytical and computational characterization. The framework extends to non-Markovian settings via filtering under latent regime uncertainty and functional It\^{o} calculus for long-memory dynamics, with transformer-based architectures enabling efficient computation directly from trajectories. Structural correctness and adaptability are confirmed on classical engineering problems, including the linear-quadratic regulator. Finally, we introduce a robust min-max formulation under model ambiguity, which yields algorithms that jointly learn optimal strategies and worst-case models. Overall, this work provides a unified, scalable approach to stochastic control in complex environments, bridging modern machine learning and classical control theory for high-dimensional decision-making.
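
As a minimal illustration of why the duality gap is computable, consider the standard terminal-wealth setting. The notation here is an assumption made for illustration and is not taken from the thesis: $U$ is a concave utility, $X^{\pi}$ the wealth process under an admissible strategy $\pi$ with initial wealth $x$, $V(x)$ the optimal value, and $Z$ a dual process satisfying $\mathbb{E}[Z_T X_T^{\pi}] \le x$ for every admissible $\pi$. Weak duality then gives, for any such $\pi$, any $y > 0$, and any such $Z$,
\[
\mathbb{E}\bigl[U(X_T^{\pi})\bigr] \;\le\; V(x) \;\le\; \mathbb{E}\bigl[\tilde{U}(y Z_T)\bigr] + x y,
\qquad
\tilde{U}(y) = \sup_{\xi > 0} \bigl( U(\xi) - \xi y \bigr).
\]
Both outer expectations can be estimated from simulated trajectories, so their difference is an observable upper bound on the suboptimality of $\pi$. This is the sense in which a gap of this kind could serve as a training signal as well as a diagnostic.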
