Enabling Big-data Computing Workflows in High-performance Networks

Chase Q. Wu
Associate Professor, Department of Computer Science and Director of the Center for Big Data
New Jersey Institute of Technology (NJIT)
SERC 306
Friday, November 17, 2017 - 14:00
Many applications in science, business, and industry are producing colossal amounts of data, now frequently termed “big data”, on the order of terabytes at present and petabytes or even exabytes in the foreseeable future. No matter which type of data is considered, an end-to-end computing solution that facilitates data transfer, processing, visualization, and analytics is essential for scientific research, knowledge discovery, and business intelligence. Such computing solutions are typically built upon data- and network-intensive workflows composed of computing modules with complex dependencies. The goal of our research is to develop an integrated and automated workflow solution to support big-data applications in high-performance networks. Together with science collaborators at national laboratories within the U.S. Department of Energy, we design a three-layer workflow architecture in which workflow performance is optimized through the co-scheduling of computing and networking resources based on resource abstraction, bandwidth reservation, and workflow mapping. This talk provides a brief tutorial on big-data scientific applications and shares our research results on various enabling technologies, based on rigorous algorithm design, theoretical dynamics analysis, and implementation, deployment, and evaluation in real networks.