The recent advances in Artificial Intelligence and Big Data Analytics are driven by large amounts of data and computing hardware at extreme scale. The High Performance Computing (HPC) community, for its part, has been tackling similar extreme-scale computing challenges for several decades. The purpose of this workshop is to bring together researchers and engineers from the Artificial Intelligence, Big Data, and HPC communities on a common platform to share the latest challenges and opportunities. The focus of the workshop is designing, implementing, and evaluating Artificial Intelligence and Big Data workloads on massively parallel hardware equipped with multi/many-core CPUs and GPUs and connected by high-speed, low-latency networks such as InfiniBand, Omni-Path, and Slingshot. The Artificial Intelligence workloads comprise training and inference of Deep Neural Networks as well as traditional models using a range of state-of-the-art Machine and Deep Learning frameworks, including Scikit-learn, PyTorch, TensorFlow, ONNX, and TensorRT. In addition, popular Big Data frameworks such as Apache Spark, Dask, and Ray are used to process large amounts of data on CPUs and GPUs to conduct insightful analyses.
The TIES 2024 workshop talks will cover a range of areas, including but not limited to:
Dan Stanzione, Texas Advanced Computing Center
Title: The NSF Leadership Computing Facility, and the National AI Research Resource.
In this talk, I’ll cover upcoming developments in computing research infrastructure to support future big data/AI/HPC platforms, including design inputs for future systems. I’ll also share early experience with the Vista system, one of the first deployments of NVIDIA’s Grace CPU and Grace Hopper CPU/GPU integration, and touch on the programming models we are supporting, shifts in user workloads, and upcoming research directions.