Developing the Machine Learning Model


Observations while building the model

The data I am working with is labeled: each row has features and a label outcome, so I have to look at it from a prediction perspective. Prediction can be either a classification or a regression problem. In my case the outcome has only two class labels, 0 or 1, meaning the drill bit is down for mining or up to move to the next mining ground. Hence this is binary classification. Overall, some relationships between the features and the label are obvious and some are not. Regression would be the better fit if, going forward, the output label becomes a value that fluctuates between an upper and a lower threshold.
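A quick way to confirm that a label column really has only two classes is to count its distinct values. The snippet below is a minimal sketch on made-up values; the column name bit_state is an assumption and not the name used in the real export.

```python
import pandas as pd

# Hypothetical label values standing in for the real drill-bit column.
# "bit_state" is an assumed name, not taken from the actual data.
labels = pd.Series([0, 0, 1, 1, 1, 0, 1, 0], name="bit_state")

# Count how many rows fall into each class.
class_counts = labels.value_counts()
print(class_counts)

# Exactly two distinct values means a binary classification problem.
is_binary = labels.nunique() == 2
print("binary classification" if is_binary else "not binary")
```

The counts are also a cheap check for class imbalance, which affects how much an accuracy score can be trusted.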

I first loaded the data and used the scatter matrix and histogram functions from the pandas and matplotlib libraries to see whether there were more relationships or anything new. To teach the model the relationship between the features and the outcome, I used several algorithms and compared them to see which one performed best in this scenario. See the code below:

Loading and Plotting the Data

1. To check that the data had loaded and that I could view it, I used the df.head(10) function to view the first 10 rows of the loaded data. Additionally, I set the index to the time-logged column.


2. Once the data was loaded I wanted to test whether I could plot it, so I used the plt.plot(df) function to see if anything would be displayed. I found that the plotted diagram does not display any meaningful data.

3. I similarly explored the shape of the data, just to get a good understanding of what I was looking at.

4. From the histograms I found that some features have a Gaussian distribution, which is important when looking at probability distributions. I will reserve some more time to look into this area.


5. The scatter matrix was a bit overwhelming, but I noticed relationships that I would not have picked up just by looking at the data. I need a lot more knowledge on how to read this chart, so I will be asking Eris for more assistance.
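Steps 1 to 5 can be sketched roughly as follows. This is not the original code: the data is randomly generated, and the column names (time_logged, torque, pressure, bit_state) are assumptions standing in for the real export.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from pandas.plotting import scatter_matrix

# Hypothetical stand-in for the real data; all column names are assumptions.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "time_logged": pd.date_range("2020-01-01", periods=n, freq="min"),
    "torque": rng.normal(50.0, 5.0, n),
    "pressure": rng.normal(120.0, 10.0, n),
})
df["bit_state"] = (df["torque"] > 50.0).astype(int)

# Step 1: index by the time-logged column and view the first 10 rows.
df = df.set_index("time_logged")
print(df.head(10))

# Step 3: number of rows and columns.
print(df.shape)

# Step 4: one histogram per column, to spot roughly Gaussian features.
hist_axes = df.hist(figsize=(8, 6))
plt.close("all")

# Step 5: pairwise scatter plots, with histograms on the diagonal.
scatter_axes = scatter_matrix(df, figsize=(8, 8))
plt.close("all")
```

In an interactive session, replacing plt.close("all") with plt.show() displays each figure. In the scatter matrix, each off-diagonal cell plots one column against another, so a visible trend in a cell suggests a relationship between that pair of columns.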


Building the Model

6. Using sklearn functions, I split the data into training and test data. I use the training data to train the model and the test data to check the model's performance on unseen data. See the test results below:

ALGORITHM TEST RESULTS

7. I wasn't too happy with the results, so I will continue investigating how best to train the model to get better results. Below is the code to build the model and use it for prediction.
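For readers who want to try the split-and-compare step themselves, here is a minimal sketch in the spirit of the tutorial referenced at the end of this post. It is not the code used for the results above: the data comes from sklearn's make_classification rather than the drill logs, and the three algorithms and the 80/20 split are assumptions chosen for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary-classification data standing in for the drill features.
X, y = make_classification(
    n_samples=500, n_features=6, n_informative=4, random_state=1
)

# Hold out 20% of the rows as unseen test data (the ratio is an assumption).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=1, stratify=y
)

# Candidate algorithms to compare; this selection is illustrative only.
models = {
    "LR": LogisticRegression(max_iter=1000),
    "KNN": KNeighborsClassifier(),
    "CART": DecisionTreeClassifier(random_state=1),
}

# 10-fold stratified cross-validation on the training data only.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=1)
results = {}
for name, model in models.items():
    scores = cross_val_score(model, X_train, y_train, cv=cv, scoring="accuracy")
    results[name] = scores.mean()
    print(f"{name}: {scores.mean():.3f} ({scores.std():.3f})")
```

Comparing with cross-validation on the training data keeps the test set untouched until a single algorithm has been chosen, so the final test score remains an honest estimate.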


Predictions 

The prediction seems to be working well. I ran the model using the prediction function provided by sklearn. The results are OK, but I would like to keep working on it. I plan to export a different random day of data and see how that performs, just for comparison, and also to refine the data to a well-performing day or days of data, where the mining rate is favorable.
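A minimal sketch of the fit-and-predict step is shown here, again on synthetic data rather than the drill logs. Using LogisticRegression is an assumption made for illustration; this post does not say which algorithm was finally selected.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the drill data (two classes: 0 and 1).
X, y = make_classification(
    n_samples=500, n_features=6, n_informative=4, random_state=1
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=1, stratify=y
)

# Train on the training rows only (model choice is an assumption).
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# Predict the label for each unseen test row.
predictions = model.predict(X_test)

# Compare the predictions against the true test labels.
accuracy = accuracy_score(y_test, predictions)
matrix = confusion_matrix(y_test, predictions)
print(f"accuracy: {accuracy:.3f}")
print(matrix)
```

The confusion matrix has the true classes as rows and the predicted classes as columns, so the off-diagonal cells show which of the two states the model confuses. Running the same fitted model on a different day of data would only require swapping X_test and y_test for that day's features and labels.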

 

Reference  

Most of the code to build the model I sourced from the website below:

https://machinelearningmastery.com/machine-learning-in-python-step-by-step/
