I’m a problem-solver at my core. I welcome the challenge to work hard. I will struggle, I will learn, and I will find solutions. I am an achiever.
Core Technologies: Python, MySQL, Spark, Tableau, Pandas, NumPy, Matplotlib, Seaborn, scikit-learn, Anaconda, Jupyter Notebooks, Git/GitHub
Core Competencies: Data Storytelling, Applied Statistics, Machine Learning, Natural Language Processing, Classification, Regression, Clustering, Time Series Analysis, Anomaly Detection
Hire Me Because
My Capstone Project: Datacenter Hard Drive Failures
On a team of four, predicted early hard drive failure. Backblaze is a cloud storage provider that publishes open source data on the hard drives used in its data center. We built the dataset by retrieving this data for 2016 through 2019 and aggregating it by serial number with Apache Spark. After aggregation, the dataset was compact enough to analyze with pandas. We developed an SVM model to predict early failures from SMART (Self-Monitoring, Analysis and Reporting Technology) stats. The findings supported clear recommendations on hard drive reliability based on a given hard drive’s manufacturer, model, and capacity. Deliverables include presentation slides, an analysis notebook, and a reliability index for the top and bottom hard drive models.
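To give a feel for the aggregation step, here is a minimal sketch of collapsing daily drive records into one row per serial number. It is an illustration only, not the project's code: the project ran this step in Apache Spark, while the sketch uses plain Python so it runs anywhere. The sample rows, the `aggregate_by_serial` helper, and the choice of summary fields are hypothetical, and the column names (`serial_number`, `model`, `failure`, `smart_5_raw`) are assumed to follow the layout of the Backblaze daily files.

```python
# Hypothetical daily records: one row per drive per day (values invented).
daily_rows = [
    {"serial_number": "A1", "model": "M-1", "failure": 0, "smart_5_raw": 0},
    {"serial_number": "A1", "model": "M-1", "failure": 0, "smart_5_raw": 8},
    {"serial_number": "A1", "model": "M-1", "failure": 1, "smart_5_raw": 24},
    {"serial_number": "B2", "model": "M-2", "failure": 0, "smart_5_raw": 0},
    {"serial_number": "B2", "model": "M-2", "failure": 0, "smart_5_raw": 0},
]


def aggregate_by_serial(rows):
    """Collapse daily rows into one summary dict per serial number."""
    summary = {}
    for row in rows:
        # Create the per-drive entry the first time a serial number appears.
        entry = summary.setdefault(
            row["serial_number"],
            {"model": row["model"], "days_observed": 0,
             "failed": 0, "max_smart_5_raw": 0},
        )
        entry["days_observed"] += 1
        # A drive counts as failed if any of its daily rows reports a failure.
        entry["failed"] = max(entry["failed"], row["failure"])
        # Keep the highest value seen for this SMART attribute.
        entry["max_smart_5_raw"] = max(entry["max_smart_5_raw"],
                                       row["smart_5_raw"])
    return summary


per_drive = aggregate_by_serial(daily_rows)
```

One row per drive is the shape a classifier such as an SVM needs: each drive becomes a single example with a failure label and summary features, and the row count drops from drive-days to drives.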