If you work in machine learning, you have probably already heard the term ‘hyperparameter’. What does it actually mean?

Almost every predictive model has some parameters. For example, if you want to predict a person's age from their height, weight and sex, the model takes these input features:

  1. Height
  2. Weight
  3. Sex
  4. Etc.

The parameters of the model are the values it learns for these inputs, such as the weight (coefficient) it assigns to each feature. These parameters are determined from the training data you are working with.
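To make "decided from the training data" concrete, here is a minimal sketch in plain Python that fits a straight line, age = w * height + b, by ordinary least squares. The numbers are invented for illustration, and the one-feature model and the name `fit_line` are assumptions chosen to keep the example short. Nobody picks `w` or `b` by hand here: both are computed from the data.

```python
# Toy data, invented for illustration: heights in cm and ages in years.
heights = [90.0, 100.0, 110.0, 120.0, 130.0]
ages = [2.0, 4.0, 5.0, 7.0, 9.0]


def fit_line(xs, ys):
    """Ordinary least squares for y = w * x + b with a single input feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products of deviations from the means.
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_xx = sum((x - mean_x) ** 2 for x in xs)
    w = sum_xy / sum_xx       # learned parameter: slope
    b = mean_y - w * mean_x   # learned parameter: intercept
    return w, b


w, b = fit_line(heights, ages)
print(f"learned weight w = {w:.3f}, learned bias b = {b:.3f}")
```

If you change the training data, `w` and `b` change with it, which is exactly what it means for them to be ordinary parameters.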

But sometimes we also deal with another kind of parameter, one that cannot be learned from our training data. Instead, we pick some sensible candidate values, train our model with each of them, and evaluate how close the result is to our desired output (for example, by looking at the loss on data held out from training). Then we move those values in the direction that takes us closer to that output. Parameters of this type are known as ‘hyperparameters’.
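The trial-and-evaluate loop described here can be sketched as follows. This is only an illustration under stated assumptions: the tiny dataset, the candidate learning rates, and the helper names `train` and `mean_squared_error` are all made up for the example. The learning rate is the hyperparameter being tried, while `w` and `b` are ordinary parameters learned by gradient descent.

```python
def train(xs, ys, learning_rate, steps=200):
    """Fit y = w * x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0  # ordinary parameters: learned from the data
    n = len(xs)
    for _ in range(steps):
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        grad_w = 2.0 / n * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2.0 / n * sum(errors)
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


def mean_squared_error(w, b, xs, ys):
    """Average squared prediction error of the line on the given data."""
    total = 0.0
    for x, y in zip(xs, ys):
        err = w * x + b - y
        total += err * err
    return total / len(xs)


# Toy data generated from y = 2x + 1, split into training and validation parts.
train_x, train_y = [0.0, 1.0, 2.0, 3.0, 4.0], [1.0, 3.0, 5.0, 7.0, 9.0]
valid_x, valid_y = [0.5, 2.5, 3.5], [2.0, 6.0, 8.0]

# Hyperparameter candidates: chosen by us, not learned from the data.
candidates = [0.001, 0.01, 0.1, 0.2]

results = {}
for lr in candidates:
    w, b = train(train_x, train_y, lr)
    results[lr] = mean_squared_error(w, b, valid_x, valid_y)
    print(f"learning_rate={lr}: validation error = {results[lr]:.3g}")

# Keep the candidate whose model does best on the validation data.
best_lr = min(results, key=results.get)
print("best learning rate:", best_lr)
```

On this toy data the smallest candidates learn too slowly within 200 steps, and the largest one overshoots so its error blows up. That is why the value has to be found by trial rather than read off the dataset.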

For example, here are some hyperparameters data scientists generally deal with:

  1. Learning rate of a model
  2. Number of hidden layers in a deep neural network
  3. Number of leaves or depth of a tree
  4. Number of clusters in k-means clustering
  5. etc
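Taking item 4 as an example, the sketch below is a bare-bones one-dimensional k-means in plain Python. The function name `k_means_1d`, the data, and the simple deterministic initialisation are assumptions made for illustration, not a reference implementation. The point to notice is that `k` is an argument the caller must supply, while the centroids are what the algorithm learns from the data.

```python
def k_means_1d(points, k, iterations=20):
    """Lloyd's algorithm on 1-D data; k is supplied by the caller."""
    ordered = sorted(points)
    # Deterministic start: spread the initial centroids across the sorted data.
    centroids = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids


# Toy data, invented for illustration.
points = [1.0, 1.2, 0.8, 5.0, 5.3, 4.9, 9.1, 8.8, 9.0]

for k in (2, 3, 4):
    centroids = k_means_1d(points, k)
    print(f"k={k}: centroids = {[round(c, 2) for c in centroids]}")
```

The algorithm returns an answer for every `k` it is given. Nothing inside it decides how many clusters there should be, so that choice stays with you.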

The hyperparameters listed above hold values that the training process does not learn from your dataset. Instead, you pick them from experience, or by running several trials with different candidate values and keeping the ones that suit your model best.

So, in conclusion, hyperparameters differ from ordinary parameters: they are not learned from the training data, and are instead chosen from experience or by trial, whether that means a few candidate values or thousands.