List of Projects
(You may choose your own project)
E0 294: Systems for Machine Learning (Jan '24)

1. Familiarization with an In-memory Computing (IMC) tool. NeuroSim is a popular, widely used open-source tool for estimating the performance of an IMC-based system executing DNN inference or training. First, use the tool to profile various layers of DNNs. Then, use the profiling results to construct a performance model of a DNN as a function of different parameters. The profiling must be rigorous, and the parameter space should be explored exhaustively.

2. Communication-aware DNN. Construct a network-on-chip (NoC) architecture that provides minimum communication latency for a given deep neural network (DNN). Communication latency arises from data movement between consecutive layers of the DNN. Experiments must cover DNNs with different connection patterns (linear, residual, dense), and the resulting communication latency should be compared against an existing NoC topology. The implementation is expected to be on a cycle-accurate simulator.

3. Mapping LLM layers onto a multicore architecture to reduce EDP. Large language models (LLMs) consist of several computation blocks. Map the different computation blocks onto different cores/threads of a multicore platform to minimize the energy-delay product (EDP). You need to evaluate at least 5 varieties of LLM and show that the same technique applies to all of them. The implementation is expected to be on a real multicore platform.

4. HW-SW co-optimization of Graph Neural Networks (GNNs). GNNs operate on large adjacency matrices. Train a GNN with the original adjacency matrix and record its accuracy. Then re-arrange the adjacency matrix into dense chunks to reduce computation. The re-arrangement, however, leads to a loss in accuracy. The objective is to obtain a trade-off between accuracy and inference latency for a given GNN. Experiments must be performed with at least 5 different GNNs on different datasets. The implementation is expected to be on a real computer system.
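The "performance model as a function of different parameters" in Project 1 could, for instance, be a regression fitted to per-layer profiling results. The sketch below uses entirely hypothetical numbers in place of real NeuroSim output and assumes a simple linear model in MAC count and weight count; an actual project would extract these features from the tool's logs and likely use a richer model.

```python
import numpy as np

# Hypothetical profiling results: each row is one DNN layer, with
# (MACs in millions, weights in millions) as features and measured
# latency in microseconds as the target. Real numbers would come
# from NeuroSim's profiling output, not from this toy table.
features = np.array([
    [118.0, 2.4],
    [231.0, 4.7],
    [57.0,  1.2],
    [462.0, 9.3],
    [29.0,  0.6],
])
latency_us = np.array([410.0, 805.0, 198.0, 1612.0, 101.0])

# Fit latency ~ a*MACs + b*weights + c by ordinary least squares.
X = np.hstack([features, np.ones((features.shape[0], 1))])
coeffs, *_ = np.linalg.lstsq(X, latency_us, rcond=None)

def predict_latency(macs_m, weights_m):
    """Predict layer latency (us) from the fitted performance model."""
    return coeffs[0] * macs_m + coeffs[1] * weights_m + coeffs[2]
```

Once validated against held-out layers, such a model lets you estimate latency for layer configurations that were never profiled directly.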
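For Project 2, a first-order estimate of NoC traffic can be obtained before touching a cycle-accurate simulator by summing the feature-map volumes exchanged between layers under each connection pattern. The feature-map sizes and the skip-connection spacing below are hypothetical, purely to illustrate why residual patterns generate more inter-tile traffic than a linear chain.

```python
# Hypothetical feature-map sizes (in KB) produced by each layer of a
# small DNN; traffic between two NoC tiles is taken as proportional
# to the activation volume exchanged between the mapped layers.
fmap_kb = [784, 392, 392, 196, 196, 98]

def linear_traffic(sizes):
    """Linear chain: each layer sends its output only to the next layer."""
    return sum(sizes[:-1])

def residual_traffic(sizes, skip_every=2):
    """Residual pattern: skip connections every `skip_every` layers
    re-send an earlier feature map to a later layer, adding traffic."""
    traffic = linear_traffic(sizes)
    for i in range(0, len(sizes) - skip_every, skip_every):
        traffic += sizes[i]  # extra edge: layer i -> layer i + skip_every
    return traffic
```

Densely connected patterns (every layer feeding every later layer) would add even more such edges, which is exactly why the project asks for all three patterns to be compared on the same NoC.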
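The objective in Project 3 can be sketched as a search over block-to-core assignments scored by EDP. The per-block energy and delay numbers below are invented, the block names are illustrative, and blocks are assumed to run serially; on a real multicore platform these costs would be measured (e.g., via performance counters and RAPL), and the search would need to scale beyond brute force.

```python
from itertools import product

# Hypothetical cost tables for an LLM's computation blocks on two core
# types of a heterogeneous multicore ("big" vs "little").
# Values are (energy in mJ, delay in ms) -- made up for illustration.
blocks = ["attention", "ffn", "layernorm"]
cost = {
    ("attention", "big"): (12.0, 1.0), ("attention", "little"): (5.0, 3.5),
    ("ffn", "big"): (20.0, 1.8),       ("ffn", "little"): (8.0, 6.0),
    ("layernorm", "big"): (1.0, 0.1),  ("layernorm", "little"): (0.4, 0.12),
}

def edp(mapping):
    """Energy-delay product of one mapping, assuming serial execution."""
    energy = sum(cost[(b, c)][0] for b, c in zip(blocks, mapping))
    delay = sum(cost[(b, c)][1] for b, c in zip(blocks, mapping))
    return energy * delay

# Exhaustive search over all assignments (feasible only for few blocks).
best = min(product(["big", "little"], repeat=len(blocks)), key=edp)
```

Note that with these toy numbers the EDP-optimal mapping is mixed: the compute-heavy blocks land on big cores while the cheap layernorm goes to a little core, which is the kind of trade-off the project is meant to expose.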
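The adjacency-matrix re-arrangement in Project 4 could be sketched as follows: permute nodes so that nonzeros cluster into dense tiles, then drop low-density tiles to cut computation at the cost of lost edges (and hence accuracy). The toy graph and degree-based ordering below are assumptions for illustration; real GNN work would use a proper clustering/reordering heuristic on much larger matrices.

```python
import numpy as np

# Hypothetical adjacency matrix of a tiny undirected 6-node graph.
A = np.array([
    [0, 1, 0, 0, 1, 0],
    [1, 0, 1, 0, 1, 1],
    [0, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 0],
    [1, 1, 0, 1, 0, 1],
    [0, 1, 0, 0, 1, 0],
])

def reorder_by_degree(adj):
    """Permute rows/cols so high-degree nodes come first, which tends
    to concentrate nonzeros in the top-left tiles."""
    order = np.argsort(-adj.sum(axis=1), kind="stable")
    return adj[np.ix_(order, order)], order

def prune_to_chunks(adj, chunk=2, threshold=0.5):
    """Zero out chunk x chunk tiles whose density is at or below the
    threshold; surviving nonzeros are the only edges still computed,
    and every dropped edge is a potential accuracy loss."""
    out = np.zeros_like(adj)
    n = adj.shape[0]
    for i in range(0, n, chunk):
        for j in range(0, n, chunk):
            tile = adj[i:i + chunk, j:j + chunk]
            if tile.mean() > threshold:
                out[i:i + chunk, j:j + chunk] = tile
    return out
```

On this toy graph, pruning the original matrix keeps 6 of 14 nonzeros, while pruning the degree-reordered matrix keeps 8 with the same number of surviving tiles, i.e., the reordering preserves more edges for the same compute budget. Sweeping the chunk size and threshold against measured accuracy is one way to trace the accuracy/latency trade-off the project asks for.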