Career Profile

  • PhD in mechanical engineering with a focus on wind energy and software development
  • Kaggle competition master (@shujian)
  • Computational mathematics researcher with 7+ years of experience in scientific software development.
  • Solid theoretical knowledge of machine learning, data mining and natural language processing.
  • Skilled in mathematical algorithms and high performance computing (CUDA/OpenMP).
  • Great passion for software development and data science.


QueryGene: Intent Classifier for Genetic Test Chatbot
- Applied a bidirectional long short-term memory (Bi-LSTM) model, implemented in Keras, to classify user intents and retrieve the matching information from the database
- Utilized pre-trained GloVe word embeddings to supply information beyond what the dataset provides
- Tuned hyperparameters with the Hyperas package and achieved 94% accuracy (the old system reached 85% with logistic regression)
- Developed a chatbot web app that explains genetic test results, built with Bootstrap, AJAX and Flask, and deployed it on AWS
Skills: Keras, Bi-LSTM, GloVe, Flask, AWS
Boston Travel Guide Application (GitHub)
- In Progress
Skills: ML, NLP, MySQL, JavaScript
Wind Power Prediction with Recurrent Neural Networks (LSTM)
- Generated and cleaned a dataset of 7 GB
- Developed an RNN/LSTM model to predict wind power from wave and wind history
- Tuned parameters and achieved an MSE loss of 0.00043
Skills: Python, Keras, RNN
Long Text Summarization with NLP Techniques
- Tested a deep neural network on TensorFlow
- Developed a rule-based sentence compression model with the NLTK library
- Built a hierarchical model (a combination of LexRank and rule-based algorithms) for long text summarization
- Found that picking the shortest clause from the most important sentence chosen by the LexRank algorithm performed best
Skills: Python, Neural networks, NLP
Online Snake Game
- Wrote the core functions of the game in JavaScript
- Built a MySQL database to store and display the scores
- Designed the page with CSS and a Bootstrap template
- Created dynamic pages with PHP and AJAX
Skills: JavaScript, PHP, AJAX, MySQL
Artist Recognition Using Machine Learning Methods
- Preprocessed the dataset to fit the problem
- Tested with several supervised and unsupervised learning methods
- Applied feature selection and dimensionality reduction
- Achieved 7.2% accuracy with the LDA method, compared with the 2.1% the original authors obtained with an ANN
Skills: Python, Scikit-learn, Data processing
Exploratory Data Analysis of Zhihu (Q&A website) Dataset
- Cleaned raw data in JSON format and stored the dataset in SQLite
- Manipulated the dataset with SQL queries and pandas DataFrames
- Visualized useful information with the Mapbox and Seaborn packages
Skills: Python, SQLite, Visualization, Data Mining
Local Data Science Event Calendar
- Scraped local events from university websites with BeautifulSoup
- Modified a calendar template to display events
- Built a MySQL database to store events and used AJAX to insert them
- Developed an administrative page in PHP for adding events
Skills: BeautifulSoup4, PHP, AJAX, MySQL
Treecode Algorithm to Accelerate WInDS Simulator
- Ported the Barnes–Hut treecode algorithm from C to CUDA and ran it on a GPU
- Modified the kernel from vortex particle to vortex filament
- Integrated this N-body algorithm into the wind turbine simulator WInDS
- Reduced the time complexity from O(n^2) to O(n log n)
Skills: C, CUDA, Parallel computing, Tree
Preconditioned Conjugate Gradient Methods for Solving Large Sparse Matrices in CSR Format - in a team of two members
- Implemented the preconditioned conjugate gradient (PCG) method in Fortran
- Compared different storage approaches for the large sparse matrix
- Conducted tests on different cases
Skills: Fortran, Numerical algorithms
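For readers unfamiliar with the technique named in this project, here is a minimal PCG sketch over a CSR matrix. It is written in Python for brevity and is not the original Fortran code. The Jacobi (diagonal) preconditioner, the function names and the 3x3 test system are all assumptions made for illustration; the project does not state which preconditioner was used.

```python
def csr_matvec(values, col_idx, row_ptr, x):
    """Multiply a CSR matrix (values, col_idx, row_ptr) by a dense vector x."""
    y = [0.0] * (len(row_ptr) - 1)
    for i in range(len(y)):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            y[i] += values[k] * x[col_idx[k]]
    return y


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def pcg_jacobi(values, col_idx, row_ptr, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for a symmetric positive definite A stored in CSR.

    Assumption: a Jacobi preconditioner (inverse of the diagonal) is used,
    and every diagonal entry is stored and nonzero.
    """
    n = len(b)
    inv_diag = [0.0] * n
    for i in range(n):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            if col_idx[k] == i:
                inv_diag[i] = 1.0 / values[k]

    x = [0.0] * n
    r = list(b)                                # residual b - A x for x = 0
    z = [inv_diag[i] * r[i] for i in range(n)]  # preconditioned residual
    p = list(z)                                # initial search direction
    rz = dot(r, z)

    for _ in range(max_iter):
        if dot(r, r) ** 0.5 < tol:
            break
        Ap = csr_matvec(values, col_idx, row_ptr, p)
        alpha = rz / dot(p, Ap)
        x = [x[i] + alpha * p[i] for i in range(n)]
        r = [r[i] - alpha * Ap[i] for i in range(n)]
        z = [inv_diag[i] * r[i] for i in range(n)]
        rz_new = dot(r, z)
        p = [z[i] + (rz_new / rz) * p[i] for i in range(n)]
        rz = rz_new
    return x


# Hypothetical test system: A = [[4, 1, 0], [1, 3, 1], [0, 1, 2]],
# with b chosen so that the exact solution is [1, 2, 3].
values = [4.0, 1.0, 1.0, 3.0, 1.0, 1.0, 2.0]
col_idx = [0, 1, 0, 1, 2, 1, 2]
row_ptr = [0, 2, 5, 7]
b = [6.0, 10.0, 8.0]
x = pcg_jacobi(values, col_idx, row_ptr, b)
```

CSR keeps only the nonzero entries plus one row-pointer array, so each matrix-vector product costs time proportional to the number of nonzeros rather than n^2.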


College of Information and Computer Sciences
  • Information Systems (CS 445)
  • Machine Learning (CS 589)
  • Algorithms for Data Science (CS 590D)
  • Distributed and Operating Systems (CS 677)
  • Introduction to Natural Language Processing (CS 585)
Department of Mathematics & Statistics
  • Mathematical Statistics I (STAT 607)
  • Mathematical Statistics II (STAT 608)
  • Applied Stat & Data Analysis (STAT 697D)
Department of Electrical and Computer Engineering
  • Algorithms (ECE 665)
  • Numerical Algorithms (ECE 697NA)
Independent Coursework
  • Statistical Learning (Stanford Online)
  • R Programming (Coursera/JHU)
  • High Performance Scientific Computing (Coursera/UW)
  • Introduction to Recommender Systems (Coursera/UMN)
  • The Complete Web Developer Course (Udemy)
  • Introduction to Computer Systems (CMU 15-213)
  • Functional Programming Principles in Scala (Coursera/EPFL)
  • Algorithms I & II (Coursera/Princeton)
  • Programming Languages, Part A (Coursera/UW)
  • Neural Networks for Machine Learning (Coursera/UofT)

Skills & Proficiency





MySQL & PostgreSQL