Publications

. mSFT: Addressing Dataset Mixtures Overfitting Heterogeneously in Multi-task SFT. Pre-print
Post-training Multi-task Generalization
. Generative Visual Code Mobile World Models. Pre-print
PDF CODE PROJECT World Model Mobile GUI Code Generation VLM Post-training
. Predicting LLM Reasoning Performance with Small Proxy Model. ICLR 2026
PDF DATASET Pre-training Scaling Reasoning Efficiency
. AdaSTaR: Adaptive Data Sampling for Training Self-Taught Reasoners. NeurIPS 2025
PDF CODE Post-training Reasoning Data Sampling Training Efficiency
. C$^2$: Scalable Auto-Feedback for LLM-based Chart Generation. NAACL 2025 Main Long (Oral)
PDF CODE PROJECT VIDEO Chart Generation Code Generation VLM-as-a-Judge Inference-time Scaling
. FlickerFusion: Intra-trajectory Domain Generalizing Multi-Agent RL. ICLR 2025
PDF CODE PROJECT Multi-Agent RL Domain Generalization
. Encoding Temporal Statistical-space Priors via Augmented Representation. IJCAI 2024 STRL Workshop (Oral)
PDF Spatio-temporal Prediction Financial Markets
. Curriculum Learning and Imitation Learning for Model-free Control on Financial Time-series. AAAI 2024 AI4TS Workshop (Oral)
PDF Reinforcement Learning Financial Markets
. Network-based exploratory data analysis and explainable three-stage deep clustering for financial customer profiling. Engineering Applications of Artificial Intelligence (SCIE, Q1)
PDF Explainable AI Personalized AI