Exploiting Inter-Layer Expert Affinity for Accelerating Mixture-of-Experts Model Inference. J. Yao, Q. Anthony, A. Shafi, H. Subramoni, D. Panda. 38th IEEE International Parallel & Distributed Processing Symposium, May 2024.