I am a Ph.D. candidate in Computer Engineering at Northwestern University, advised by Prof. Alok Choudhary, and expect to graduate around June 2020. My research focuses on building artificial intelligence (AI)-based predictive models for diverse scientific applications by bringing together big data, machine learning, and high-performance computing systems. Over the last four years, I have designed several deep learning-based systems that learn variables of scientific interest directly from the raw inputs of scientific experiments and simulations (chemical formulae, crystal structures, electron diffraction patterns, and 1D and 2D X-ray diffraction patterns) using frameworks such as TensorFlow, PyTorch, Caffe, and Theano. I have also explored the parallelization of deep neural networks, implementing a hybrid parallel framework for recent CNN models such as Inception and ResNet on supercomputers using C++, Caffe, and MPI. Recently, I contributed to developing a reinforcement learning-based neural architecture search framework for diverse scientific applications at Argonne. I look forward to learning more about advanced AI techniques such as generative adversarial networks and reinforcement learning, and to applying my knowledge toward building intelligent AI-based systems that solve real-world and scientific problems.
Before turning to machine learning and artificial intelligence, I explored computer networks and distributed systems under Prof. Fabian Bustamante and obtained an M.S. in Computer Science from Northwestern University. Before joining Northwestern, I completed my Bachelor of Engineering in Computer Science with distinction from the Institute of Engineering at Tribhuvan University in Nepal.
Designing scalable, intelligent, deep learning-based predictive models for complex scientific datasets from simulations and experiments.
Advising software engineers and data scientists on building AI tools with state-of-the-art deep learning architectures for real-time image recognition.
Built a neural architecture search framework that automates the search for deep neural architectures for scientific datasets using reinforcement learning on supercomputers.
Taught courses such as C, C++, Java, Object-Oriented Analysis and Design, Theory of Computation, and Software Engineering; supervised students on a variety of undergraduate course and thesis projects in computer science and engineering.
Taught Theory of Computation to undergraduate students in computer science and engineering.
Developed Information and Content Management Systems for educational and medical institutes in Nepal.
Machine Learning, Deep Learning, and Artificial Intelligence.
Wireless and Cellular Networks.
Graduated with distinction, ranked first (University Topper) among 3,000 undergraduate students (across all disciplines).
Conventional machine learning approaches for predicting material properties from elemental compositions have emphasized the importance of leveraging domain knowledge when designing model inputs. Here, we demonstrate that a deep learning approach can bypass such manual, domain-knowledge-driven feature engineering and achieve much better results, even with only a few thousand training samples. We present the design and implementation of a deep neural network model referred to as ElemNet; it automatically captures the physical and chemical interactions and similarities between different elements, which allows it to predict material properties with better accuracy and speed. The speed and best-in-class accuracy of ElemNet enable fast and robust screening for new material candidates in a huge combinatorial space; we predict hundreds of thousands of chemical systems that could contain yet-undiscovered compounds (published in Nature Scientific Reports).
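The core idea, predicting a property directly from a normalized elemental-fraction vector with a plain fully connected network and no hand-crafted features, can be sketched as follows. This is a minimal illustrative sketch, not the published ElemNet architecture: the element list, layer sizes, and random weights here are all placeholders (ElemNet itself uses an 86-element input and a 17-layer network trained on DFT data).

```python
import numpy as np

# Toy element vocabulary; ElemNet's real input covers 86 elements.
ELEMENTS = ["H", "C", "N", "O", "Ti", "Zn", "Sn", "Ni"]

def composition_vector(formula_counts):
    """Convert {element: count} into a normalized fraction vector --
    the kind of raw, feature-engineering-free input the model consumes."""
    vec = np.zeros(len(ELEMENTS))
    total = sum(formula_counts.values())
    for elem, count in formula_counts.items():
        vec[ELEMENTS.index(elem)] = count / total
    return vec

def mlp_forward(x, weights, biases):
    """Forward pass of a small fully connected network with ReLU activations;
    the final linear layer outputs one property value (e.g. formation enthalpy)."""
    h = x
    for W, b in zip(weights[:-1], biases[:-1]):
        h = np.maximum(0.0, h @ W + b)
    return h @ weights[-1] + biases[-1]

rng = np.random.default_rng(0)
dims = [len(ELEMENTS), 16, 8, 1]  # illustrative layer sizes, not ElemNet's
weights = [rng.normal(0.0, 0.1, (a, b)) for a, b in zip(dims[:-1], dims[1:])]
biases = [np.zeros(b) for b in dims[1:]]

x = composition_vector({"Ti": 1, "O": 2})  # TiO2 as raw input
prediction = mlp_forward(x, weights, biases)
```

In practice the network is trained end to end, so the hidden layers learn the inter-element interactions that conventional pipelines encode by hand.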
We present a deep learning approach to the indexing of Electron Backscatter Diffraction (EBSD) patterns. We design and implement a deep convolutional neural network architecture to predict crystal orientation from the EBSD patterns. We design a differentiable approximation to the disorientation function between the predicted crystal orientation and the ground truth; the deep learning model optimizes the mean disorientation error between the predicted crystal orientation and the ground truth using stochastic gradient descent. The deep learning model is trained using 374,852 EBSD patterns of polycrystalline Nickel from simulation and evaluated using 1,000 experimental EBSD patterns of polycrystalline Nickel. The deep learning model achieves a mean disorientation error of 0.548° compared to 0.652° using dictionary-based indexing (published in Microscopy and Microanalysis).
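The key enabler is that the misorientation angle between two orientations has a closed-form, differentiable expression, so it can serve directly as a training loss. The sketch below shows only that core quaternion identity; it is an assumption-laden simplification that omits the minimization over crystal-symmetry operators which the full disorientation function (and the paper's approximation to it) must handle.

```python
import numpy as np

def misorientation_angle(q_pred, q_true):
    """Rotation angle (radians) between two unit quaternions:
    theta = 2 * arccos(|<q_pred, q_true>|).
    Every step is differentiable almost everywhere, so the expression can be
    minimized by stochastic gradient descent. (The full disorientation also
    minimizes over crystal-symmetry operators, omitted here.)"""
    dot = np.abs(np.sum(q_pred * q_true))  # |.| handles the quaternion double cover
    dot = np.clip(dot, -1.0, 1.0)          # guard arccos against rounding error
    return 2.0 * np.arccos(dot)

# Identity orientation vs. itself -> zero misorientation.
q_id = np.array([1.0, 0.0, 0.0, 0.0])

# A 90-degree rotation about z: q = (cos 45deg, 0, 0, sin 45deg).
q90 = np.array([np.cos(np.pi / 4), 0.0, 0.0, np.sin(np.pi / 4)])
angle = misorientation_angle(q_id, q90)  # pi/2
```

Averaging this quantity over a batch gives a mean-disorientation training objective of the kind the model optimizes.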
X-ray diffraction (XRD) is a well-known technique used by scientists and engineers to determine atomic-scale structures as a basis for understanding the composition-structure-property relationship of materials. The current approach to the analysis of XRD data is a multi-stage process requiring several intensive computations, such as integration along 2θ for conversion to 1D patterns (intensity-2θ), background removal by polynomial fitting, and indexing against a large database of reference peaks. This delays decisions about subsequent experiments on the materials under investigation and slows the overall process. In this paper, we focus on eliminating such multi-stage XRD analysis by directly learning the phase regions from the raw (2D) XRD image. We introduce a peak area detection network (PADNet) that directly learns to predict the phase regions using the raw XRD patterns, without any need for explicit preprocessing and background removal. PADNet contains specially designed large symmetrical convolutional filters at the first layer to capture the peaks and automatically remove the background by computing the difference in intensity counts across different symmetries. We evaluate PADNet using two sets of XRD patterns collected from SLAC and a Bruker D-8 for the Sn-Ti-Zn-O composition space; each set contains 177 experimental XRD patterns with their phase regions. We find that PADNet can successfully classify the XRD patterns independent of the presence of background noise and performs better than the current approach of extrapolating phase region labels based on 1D XRD patterns (accepted to the International Joint Conference on Neural Networks).
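The intuition behind the large symmetric first-layer filters can be seen in a toy 1D analogue (the real PADNet filters are 2D and learned; the fixed weights and sizes below are purely illustrative): a filter that compares the center intensity against a symmetric surround cancels a slowly varying background exactly, while sharp diffraction-like peaks survive.

```python
import numpy as np

# Toy 1D analogue of a large symmetric background-removing filter:
# center weight +1, symmetric negative surround summing to -1, so any
# locally linear background contributes zero to the response.
width = 21  # illustrative filter size
kern = np.full(width, -1.0 / (width - 1))
kern[width // 2] = 1.0

x = np.linspace(0.0, 1.0, 200)
background = 5.0 + 3.0 * x                  # slowly varying background
peak = np.exp(-((x - 0.5) ** 2) / 2e-4)     # sharp diffraction-like peak
signal = background + peak

response = np.convolve(signal, kern, mode="same")
# Away from the array edges, the response is ~0 over pure-background
# regions and large at the peak: background removed without any
# explicit polynomial fitting.
```

A learned stack of such filters can thus fold the background-removal stage of the conventional pipeline into the network itself.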
Developed a 48-layer residual DNN model to learn material properties from vectors encoding crystal structure and composition; the model significantly outperformed existing machine learning models on multiple tasks across multiple datasets, without any feature engineering.
Automating the search for neural networks with dynamic architectures for different types of scientific datasets using reinforcement learning.
Scaling up the training of state-of-the-art deep learning models such as Inception and ResNet using a hybrid parallel pipelining approach on supercomputers.