The HPC and Data Science communities are growing steadily and find themselves at the forefront of many scientific discoveries. During the Covid pandemic, the HPC community stepped up and offered its resources to help find fast ways of identifying and treating the virus. Given the numerous problems HPC is poised to solve, it is imperative that we have a diverse environment in which people can grow. Diversity includes not only gender but also diversity of thought: people from different backgrounds can provide insights into new ways of solving technical problems. The “Diversity in Computing” workshop has been scheduled with this thought in mind. Given the virtual nature of the HiPC 2021 conference, our talk and panel discussion will also be virtual.
Invited Lecture: The Silent Learner and the Oblivious Teacher: Meaning Aware Storage for AI (40 minutes)
Panel Discussion: Opportunities and challenges in diversifying the HPC community (20 minutes)
Speaker: Suparna Bhattacharya, Distinguished Technologist, Hewlett-Packard Enterprise
Title: The Silent Learner and the Oblivious Teacher: Meaning Aware Storage for AI.
Abstract: Artificial intelligence (AI) applications excel at deriving meaningful insights from data, but they do not cope well with storage management challenges and data gravity constraints when data is generated at a high pace. Meanwhile, conventional data fabric software is optimized for data access rather than for ensuring timely insights and value from huge volumes of data. The problem is rooted in a classical cross-layer dichotomy in system software design: the AI application layer lacks the deep knowledge needed to optimize the data services, while the system layer is closest to the data but lacks an understanding of the intent and of what to prioritize.
We resolve this omniscience dilemma by introducing Meaning Aware Storage (MAS), a set of techniques to proactively optimize analytics computations and data storage. Meaning Aware Storage departs from traditional optimizations to prioritize insight-rich data, saving resources expended on similar (or less informative) data or similar analyses, much as our brains learn to sift meaningful data. We achieve this through mechanisms that enable a data fabric to silently learn the purpose and relevance of stored data by intercepting the lineage of AI workflows under execution within existing analytics frameworks. The data fabric layer (where data is stored and managed) can subsequently initiate partial analytics computations proactively, based on previously observed AI processes. Our proposed techniques aid AI applications by transparently providing them with relevant data or precomputed insights and by alleviating storage management and data gravity challenges using proactive tiering and data approximation. Such optimizations can result in up to an order of magnitude reduction in storage space occupied in the fastest tier and in time to value for AI applications.
Speaker Bio: Suparna Bhattacharya is a Distinguished Technologist in the Hewlett-Packard Enterprise Storage CTO office. She holds a PhD in Computer Science and Automation from the Indian Institute of Science and a B.Tech in Electronics and Electrical Communication from IIT Kharagpur. Suparna worked at IBM from 1993 to 2014, delving into operating systems and file-system internals on various platforms. She made her foray into the Linux kernel in 2000 and was introduced to the joys of working on open source. Her contributions to Linux span multiple areas, and she was regularly invited to chair sessions at the Linux Kernel Summit, a by-invitation-only event where key contributors to the Linux kernel get together to decide the future roadmap of the kernel. Suparna was elected to the IBM Academy of Technology in 2005, and in 2012 she moved to IBM’s research division after an educational leave of absence to pursue doctoral studies. Her dissertation work on “A systems perspective of software runtime bloat and its power-performance implications” impacted diverse research communities, resulting in publications at top-tier venues such as SIGMETRICS, ECOOP, HotOS, HotPower, OOPSLA and ICML, and was awarded the best PhD thesis in the Department of Computer Science and Automation at IISc. At IBM Research, Suparna initiated exploratory projects on software-defined memory and systems-software co-design for extreme-scale contextual and cognitive computing. These days at Hewlett-Packard Enterprise, she focuses on the implications of emerging non-volatile memory technologies in future systems.