Large Language Models Can Control Their Own Attention Span
May 22, 2026 · [P2]
Type: Conference paper
Publication: Pre-print
Last updated on May 23, 2026
Long-Context Efficient Inference