Project Name: Datacenter Hard Drive Failures
On a team of four, classified and predicted which hard drive models have high or low reliability. Backblaze is a cloud storage provider that publishes open data on the hard drives used in its data centers. The dataset was built from the 2016 to 2019 data and aggregated by serial number using Spark. After aggregation, the dataset was small enough to use Python for the remaining analysis. A model was developed to predict early failures using SMART (Self-Monitoring, Analysis and Reporting Technology) stats. The findings were used to make clear recommendations regarding hard drive reliability based on a given hard drive's manufacturer, model, and capacity. Deliverables include presentation slides, an analysis notebook, and a reliability index for each hard drive model.
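As a rough illustration of the aggregation step described above, here is a minimal sketch in plain Python rather than Spark. It is not the project's actual code. The field names (serial_number, model, failure) are assumed to follow the Backblaze daily-record layout, the helper names are hypothetical, and the sample rows are made up.

```python
from collections import defaultdict


def aggregate_by_serial(rows):
    """Collapse daily drive records into one summary per serial number."""
    summary = {}
    for row in rows:
        drive = summary.setdefault(
            row["serial_number"],
            {"model": row["model"], "days_observed": 0, "failed": 0},
        )
        drive["days_observed"] += 1
        # A drive counts as failed if any of its daily records flags a failure.
        drive["failed"] = max(drive["failed"], row["failure"])
    return summary


def failure_rate_by_model(summary):
    """Share of drives per model that failed at least once."""
    drives = defaultdict(int)
    failures = defaultdict(int)
    for drive in summary.values():
        drives[drive["model"]] += 1
        failures[drive["model"]] += drive["failed"]
    return {model: failures[model] / drives[model] for model in drives}


# Made-up sample records, one per drive per day (assumed layout).
rows = [
    {"serial_number": "A1", "model": "M-1", "failure": 0},
    {"serial_number": "A1", "model": "M-1", "failure": 1},
    {"serial_number": "B2", "model": "M-1", "failure": 0},
    {"serial_number": "C3", "model": "M-2", "failure": 0},
]

summary = aggregate_by_serial(rows)
rates = failure_rate_by_model(summary)
```

A per-model failure share like this is only one possible input to a reliability index. The real analysis also drew on SMART stats, which this sketch leaves out.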
I'm a problem-solver at my core. I welcome the challenge to work hard. I will struggle, I will learn, and I will find solutions. I am an achiever.
being able to efficiently explore many paths, to ultimately find the best solution.
Analytical, detail-oriented, and a quick learner
Goal-oriented, supportive, and impactful