Collaborative Research: Cross-Layer Exploration of Non-Volatile Solid-State Memories to Achieve Effective I/O Stack for High-Performance Computing Systems
Sponsored by NSF-CCF
Abstract:
The objective of this research is to develop techniques that utilize solid-state memory technologies from the device, circuit, architecture, and system perspectives across the I/O hierarchy, in order to exploit their true potential for improving I/O stack performance in high-performance computing systems.
I/O-friendly memory system architectures will be developed to enable hybrid processor-memory 3D integration with greatly reduced off-chip I/O traffic. Adaptive cache management and hotspot prediction methods will be developed to address the low random-write performance of solid-state drives, and data processing techniques will be developed to enable run-time configurable trade-offs among solid-state drive performance characteristics. A comprehensive full-system simulation infrastructure will be developed to evaluate and demonstrate the research under diverse high-performance computing workloads.
The research will help high-performance computing systems make the most effective use of existing and emerging memory and processing technologies to tackle the grand I/O stack design challenge. It can contribute substantially to keeping high-performance computing systems on their historical scaling track, and hence benefit numerous real-life application areas such as biology, chemistry, earth science, and health care. This project will also contribute to society by engaging under-represented groups, disseminating research infrastructure for education and training, and reaching out to high school students.
Personnel
- Investigators
- Dr. Xubin He, Virginia Commonwealth University, PI
- Collaborators
- Dr. Tao Li, University of Florida
- Dr. Tong Zhang, Rensselaer Polytechnic Institute
- Changsheng Xie, Huazhong University of Science and Technology
- Post Doctoral Researcher
- Dr. Ping Huang, March 2013-August 2013
- Graduate Students
- Guanying Wu, Virginia Commonwealth University (PhD student, Spring 2008-)
- Chentao Wu, Virginia Commonwealth University (PhD student, August 2010-December 2012)
- Ben Eckart, Tennessee Tech University (MS student, Spring 2009-August 2010)
- Undergraduate Students
- Luke McNeese, Tennessee Tech University (Fall 2009-Spring 2010)
Recent Publications
- P. Huang, P. Subedi, X. He, S. He, and K. Zhou (2014). FlexECC: Partially Relaxing ECC of MLC SSD for Better Cache Performance. The USENIX Annual Technical Conference (ATC). Philadelphia.
- P. Huang, G. Wu, X. He, and W. Xiao (2014). An Aggressive Worn-out Flash Block Management Scheme to Alleviate the SSD Performance Degradation. The European Conference on Computer Systems (Eurosys). Amsterdam, The Netherlands.
- G. Wu and X. He, “Reducing SSD Access Latency via NAND Flash Program and Erase Suspension,” Journal of Systems Architecture, Vol. 60, No. 4, 2014, pp. 345-356.
- G. Wu, X. He, N. Xie, and T. Zhang, “Exploiting Workload Dynamics to Improve SSD Read Latency via Differentiated Error Correction Codes,” ACM Transactions on Design Automation of Electronic Systems (TODAES), Vol. 18, No. 4, October 2013.
- Hua Wang, Ping Huang, Shuang He, Ke Zhou, Chunhua Li, and Xubin He, “A Novel I/O Scheduler for SSD with Improved Performance and Lifetime”, Proc. of the 29th IEEE Symposium on Massive Storage Systems and Technologies (MSST), May 2013 (acceptance rate: 30 out of 109 submissions=27.5%).
- Chentao Wu and Xubin He, "GSR: A Global Stripe-based Redistribution Approach to Accelerate RAID-5 Scaling", Proc. of the 41st International Conference on Parallel Processing (ICPP'2012), Pittsburgh, PA, September 10-13, 2012 (acceptance rate: 28%).
- Chentao Wu, Xubin He, Jizhong Han, Huailiang Tan, and Changsheng Xie, "SDM: A Stripe-based Data Migration Scheme to Improve the Scalability of RAID-6", Proc. of the IEEE International Conference on Cluster Computing (Cluster'2012), Beijing, September 24-28, 2012 (acceptance rate: 28.86%).
- G. Wu and X. He, "Delta FTL: Improving SSD Lifetime via Exploiting Content Locality", Proc. of the European Conference on Computer Systems (Eurosys'2012), acceptance rate: 27/178=15%.
- G. Wu and X. He, "Reducing SSD Read Latency via NAND Flash Program and Erase Suspension", Proc. of the 10th USENIX Conference on File and Storage Technologies (FAST '12), acceptance rate: 26/137=19%.
- G. Wu, X. He, and B. Eckart, "An Adaptive Write Buffer Management Scheme for Flash-based SSD," ACM Transactions on Storage, Feb. 2012.
- Xin Chen, Xubin He, He Guo, and Yuxin Wang, “Design and Evaluation of an Online Anomaly Detector for Distributed Storage Systems”, Journal of Software, 2011 (in print).
- G. Wu, C. Wu, and X. He, "Latent Sector Error Modeling and Detection for NAND Flash-based SSDs", Poster session report, the 9th USENIX Conference on File and Storage Technologies (FAST2011), Feb 15-17, 2011.
- S. Wan, Q. Cao, J. Huang, S. Li, X. Li, S. Zhan, L. Yu, C. Xie, and X. He, "Victim Disk First: An Asymmetric Cache to Boost the Performance of Disk Arrays under Faulty Conditions", The USENIX Annual Technical Conference, Portland, OR, June 15-17, 2011 (acceptance rate: 27/180=15%).
- C. Wu, X. He, G. Wu, S. Wan, X. Liu, Q. Cao, and C. Xie, "HDP Code: A Horizontal-Diagonal Parity Code to Optimize I/O Load Balancing in RAID-6," Proceedings of the 41st Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN2011), June 27-June 30, 2011, Hong Kong, China (acceptance rate: 26/148=17.6%).
- C. Wu, S. Wan, X. He, Q. Cao, and C. Xie, "H-Code: A Hybrid MDS Array Code to Optimize Partial Stripe Writes in RAID-6", The 25th IEEE International Parallel & Distributed Processing Symposium (IPDPS), Anchorage, Alaska, May 16-20, 2011.
- Chentao Wu, Xubin He, Qiang Cao and Changsheng Xie, "Hint-K: An Efficient Multi-level Cache Using K-step Hints", Proceedings of the 39th International Conference on Parallel Processing (ICPP), Sept. 13-16, 2010.
- G. Wu, X. He, N. Xie, and T. Zhang, "DiffECC: Improving SSD Read Performance Using Differentiated Error Correction Coding Schemes", The 18th Annual Meeting of the IEEE/ACM International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunications Systems, August 17-19, 2010. Best Paper Award Candidate.
- S. Wan, Q. Cao, C. Xie, B. Eckart, and X. He, "Code-M: A Non-MDS Erasure Code Scheme to Support Fast Recovery from up to Two-Disk Failures in Storage Systems", Proceedings of the 40th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN 2010), June 28-July 1, 2010.
- G. Wu, B. Eckart, and X. He, "BPAC: An Adaptive Write Buffer Management Scheme for Flash-based Solid State Drives," The 26th IEEE Symposium on Massive Storage Systems and Technologies (MSST2010), May 6-7, 2010.
- L. McNeese, G. Wu, and X. He, "The Hot Pages Associative Translation Layer for Solid State Drives," Work in progress report, the 8th USENIX Conference on File and Storage Technologies (FAST2010), Feb 23-26, 2010.
- C. Wu, X. He, S. Wan, Q. Cao, C. Xie, “Hotspot Prediction and Cache in Distributed Stream-processing Storage Systems”, To be presented at the International Performance Computing and Communication Conference (IPCCC), December 14-16, 2009.
Invention Disclosure and Patent
- Improving SSD Lifetime via Exploiting Content Locality, X. He and G. Wu, file pending, 2013.
Thesis/Dissertations
[PhD] Guanying Wu, "Performance and Reliability Study and Exploration of NAND Flash-based Solid State Drives", Date Graduated: August 2013. First employment after graduation: LSI Inc., Denver, CO
[PhD] Chentao Wu, “Improve the Performance and Scalability of RAID-6 Systems Using Erasure Codes”, Date Graduated: December 2012. First employment after graduation: Assistant Professor, Shanghai Jiaotong University, Shanghai, China.
[MS] Guanying Wu, "Design and Evaluation of an Adaptive Write Buffer Cache for Solid State Drives", Date Graduated: December 2009. First employment after graduation: PhD candidate at TTU/VCU.
Sponsor
National Science Foundation (NSF)