Gaussian kernel regression with Matlab code
In this article, I will explain the Gaussian Kernel Regression algorithm (also called a Gaussian kernel smoother, Gaussian-kernel-based linear regression, or RBF kernel regression), and I will share my Matlab code for it.
If you already know the theory, just download the code here: <Download>
You can see how to use this function below. It is super easy.
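A minimal usage sketch, assuming the call signature gaussian_kern_reg(xs, x, y, h) that appears in the comments below. The one-line stand-in is written from the equations in this article so that the sketch runs without the download; it is not the downloadable file itself, and the training data are made up for illustration.

```matlab
% Stand-in so the sketch runs without the download (an assumption written
% from the equations in this article, not the downloadable file itself).
% Delete this definition to call the downloaded gaussian_kern_reg instead.
gaussian_kern_reg = @(xs, x, y, h) ...
    sum(exp(-(xs - x).^2 ./ (2*h^2)) .* y) ./ sum(exp(-(xs - x).^2 ./ (2*h^2)));

% Illustrative training data (made up): a smooth curve plus a fast wiggle.
x = 1:100;
y = sin(x/10) + 0.1*cos(7*x);
h = 5;                          % kernel bandwidth

% Predict at 100 query points, one scalar query per call.
xs = zeros(1, 100);
ys = zeros(1, 100);
for i = 1:100
    xs(i) = i;
    ys(i) = gaussian_kern_reg(xs(i), x, y, h);
end
```

Calling plot(x, y, '.', xs, ys, '-') afterwards should show the smoothed curve running through the scattered training points.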
From here, I will explain the theory.
Basically, this algorithm is a kernel-based linear smoother in which the kernel is the Gaussian kernel. With this smoothing method, we can find a nonlinear regression function.
The linear smoother is expressed by the equation below.
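A reconstruction of that equation in the standard Nadaraya-Watson form, which matches the remark in the comments that the numerator and the denominator share the same kernel terms (N, the number of training points, is a symbol introduced here):

```latex
y^* = \frac{\sum_{i=1}^{N} K(x^*, x_i)\, y_i}{\sum_{i=1}^{N} K(x^*, x_i)}
```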
Here x_i is the i-th training input, y_i is the i-th training output, and K is a kernel function. x^* is a query point, and y^* is the predicted output.
In this algorithm, we use the Gaussian kernel, which is expressed by the equation below. Another name for this function is the Radial Basis Function (RBF), because it is not exactly the same as the Gaussian function.
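A reconstruction of the kernel, keeping the 1 / sqrt(2 * pi) factor that the first comment below attributes to the original equation; h is the kernel bandwidth:

```latex
K(x^*, x_i) = \frac{1}{\sqrt{2\pi}} \exp\!\left( -\frac{(x^* - x_i)^2}{2h^2} \right)
```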
With these equations, we can smooth the training data outputs, and thus find a regression function.
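A small worked example may help. It assumes the smoother is the kernel-weighted average of the training outputs and the kernel is a Gaussian with bandwidth h. Take two made-up training points (x_1, y_1) = (0, 0) and (x_2, y_2) = (2, 2), with h = 1 and query point x^* = 0:

```latex
\begin{aligned}
K(0, 0) &= \tfrac{1}{\sqrt{2\pi}}\, e^{0}, \qquad
K(0, 2) = \tfrac{1}{\sqrt{2\pi}}\, e^{-(0-2)^2 / (2 \cdot 1^2)} = \tfrac{1}{\sqrt{2\pi}}\, e^{-2}, \\
y^* &= \frac{K(0,0) \cdot 0 + K(0,2) \cdot 2}{K(0,0) + K(0,2)}
     = \frac{2 e^{-2}}{1 + e^{-2}} \approx 0.238 .
\end{aligned}
```

The prediction is pulled slightly away from y_1 = 0 toward y_2 = 2. At x^* = 1, halfway between the two points, both weights are equal and the prediction is exactly 1.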
This program <Download> was made for one-dimensional inputs. If you need multidimensional input, see this article: http://youngmok.com/gaussian-kernel-regression-for-multidimensional-feature-with-matlab-code/. I recently made a new version for multidimensional input.
For the optimization of kernel bandwidth, see my other article <Link>.
Then good luck.
-Mok-
—————————————————————————————————————————–
I am Youngmok Yun, and I write about robotics theory and my research.
My main site is http://youngmok.com, and Korean ver. is http://yunyoungmok.tistory.com.
—————————————————————————————————————————–
Thank you! This was very helpful.
One thing I noticed: I think you may be missing an 'h' term outside of the exponential. The equation for a Gaussian has the term 1 / (h * sqrt(2 * pi)) outside the exponential, and your equation just has 1 / sqrt(2 * pi).
Hi, Chris. First of all, thank you for your interest in my post.
The answer is yes and no.
Yes, the Gaussian function needs to have the term 1 / (h * sqrt(2 * pi)). Good point. Thank you.
No, it is not necessary here, because the term cancels: the numerator and the denominator contain the same factor. That is why I said that another name for this kernel is RBF (radial basis function).
-Mok-
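The cancellation described in this reply can be checked numerically. A minimal sketch with made-up data (all variable names and values are assumptions for illustration): it computes the prediction once with unnormalized RBF weights and once with the fully normalized Gaussian weights.

```matlab
% Made-up training data and a single query point.
x  = [1 2 4 7];
y  = [2 3 5 4];
h  = 1.5;                         % kernel bandwidth
xq = 3;                           % query point

c       = 1 / (h * sqrt(2*pi));   % full Gaussian normalization constant
w_rbf   = exp(-(xq - x).^2 / (2*h^2));   % weights without the constant
w_gauss = c * w_rbf;                     % weights with the constant

% The constant appears in both numerator and denominator, so it cancels.
y_rbf   = sum(w_rbf   .* y) / sum(w_rbf);
y_gauss = sum(w_gauss .* y) / sum(w_gauss);

fprintf('RBF: %.12f   Gaussian: %.12f\n', y_rbf, y_gauss);
```

The two printed values should agree up to floating-point rounding, whatever constant multiplies the kernel.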
Hello,
nice work, you made it seem easy.
Why wouldn't it work for a multi-dimensional problem? And I need multi-D 🙂
Is this RBF algorithm the same as in neural networks? Can it be used for classification problems?
Thanks,
I am really sorry for the very late response. I finally wrote code for multidimensional input. If you still need it, you can download it from the article below.
http://youngmok.com/gaussian-kernel-regression-for-multidimensional-feature-with-matlab-code/
Thank you for your interest in my blog.
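For readers who only want the idea, here is a rough sketch of one way to extend the smoother to multidimensional inputs: replace the squared difference with the squared Euclidean distance and keep a single bandwidth h for all dimensions. This is an assumption for illustration, not necessarily what the code in the linked article does, and the data are made up.

```matlab
% X: n-by-d training inputs, y: n-by-1 training outputs (made-up data).
X  = [0 0; 1 0; 0 1; 1 1];
y  = [0; 1; 1; 2];
h  = 0.5;                % one bandwidth shared by all dimensions (assumption)
xq = [0.5 0.5];          % 1-by-d query point

% Squared Euclidean distance from the query to every training row.
d2 = sum((X - repmat(xq, size(X, 1), 1)).^2, 2);

% Gaussian weights and the kernel-weighted average of the outputs.
w  = exp(-d2 / (2*h^2));
yq = sum(w .* y) / sum(w);
```

Here the query sits at the same distance from all four training points, so the weights are equal and yq is simply the mean of y.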
Pingback: Kernel Regression | Chris McCormick
Hello,
Thanks, Nice work!
I have a couple of questions about the code.
1. What is xs(i)? Is it the same as x(i)? How can I adjust it in my code based on the training data, such as xs(i) = traindata(i,:) instead of xs(i)=i?
2. What is the range for the kernel bandwidth parameter (h in the code)? In order to tune the parameter, what range should be used for a given dataset?
I also need the code for more than one-dimensional input (multi-dimensional features). Can you please provide it to me?
Thank you very much
Sorry for the late response; I have been very busy recently.
1. xs is a point at which to predict. Another name is query point or test point. In my code, the training data are x and y.
2. h can be tuned by a cross-validation method or just by the user's feeling ^_^. I will soon upload a cross-validation program to optimize the value of h.
3. I just uploaded a new version of the Gaussian Kernel Regression code in the article below.
http://youngmok.com/gaussian-kernel-regression-for-multidimensional-feature-with-matlab-code/
Thank you for your interest in my blog.
I just made a program to optimize the kernel bandwidth. If you are still interested, see this article.
http://youngmok.com/gaussian-kernel-bandwidth-optimization-with-matlab-code/
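As a rough illustration of tuning h by cross-validation, here is a minimal leave-one-out sketch. It is not the program from the linked article; the candidate bandwidths, the data, and all variable names are assumptions made for this example.

```matlab
% Made-up one-dimensional training data.
x  = 1:30;
y  = sin(x/5) + 0.2*cos(3*x);

hs = [0.5 1 2 4 8];           % candidate bandwidths (an assumption)
cv = zeros(size(hs));         % leave-one-out mean squared error per candidate

for k = 1:numel(hs)
    h   = hs(k);
    err = zeros(size(x));
    for i = 1:numel(x)
        keep    = true(size(x));
        keep(i) = false;      % leave training point i out
        w       = exp(-(x(i) - x(keep)).^2 / (2*h^2));
        err(i)  = y(i) - sum(w .* y(keep)) / sum(w);
    end
    cv(k) = mean(err.^2);
end

[~, best] = min(cv);          % index of the smallest cross-validation error
h_best    = hs(best);
```

A finer grid of candidates, or a one-dimensional optimizer over h, would follow the same pattern; the grid here is kept short for readability.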
Hello,
Thanks for your work.
I have the same questions. If I want to use kernel regression to train a model and then apply that model to a test dataset, how should I do it? There will be large training and test datasets.
Thanks again !
Ling
Hi, Lingchen. That is an excellent question. To train a model, one common and general approach is cross-validation (e.g., leave-one-out cross-validation). Briefly: divide the data set into two groups, a training set and a test set. Build a model from the training set and evaluate it on the test set. This is the basic idea of cross-validation. If time allows, I will post a simple example of cross-validation this week.
Thank you for visiting my website, and have a nice day.
Hello,
Thanks for your reply. I understand the concept of cross-validation. I just do not know how to use the training dataset to build the model. By that I mean, I do not know how to create the model function from the training dataset.
Yes, if possible, it would be very nice if you could give us a simple example of kernel regression using training and test datasets.
Thanks a lot again! You are really nice! Have a nice day!
Ling
Hello, Lingchen,
Oh… Now I understand what you are saying. I think you want a closed-form equation, for example f = a*x + b*x^2 + c*cos(x) … . Unfortunately, this smoother does not give you such an equation. All of the data points are actually part of the model. Gaussian kernel regression is very different from polynomial fitting or other traditional (?) fitting methods. Please let me know if you need more.
-Mok-
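To make the point about the data being the model concrete, here is a small sketch with made-up training and test sets (all names and values are assumptions). There is no fitting step: "training" just means keeping x_train and y_train, and each test prediction is computed directly from them.

```matlab
% "Training" only stores the data; there are no coefficients to fit.
x_train = 0:0.5:10;
y_train = x_train.^2 / 10;        % made-up training outputs

% Made-up test set with known outputs, used only to score the predictions.
x_test  = [1.25 4.75 8.25];
y_test  = x_test.^2 / 10;

h      = 0.5;                     % kernel bandwidth
y_pred = zeros(size(x_test));
for i = 1:numel(x_test)
    % Every prediction uses all stored training points.
    w         = exp(-(x_test(i) - x_train).^2 / (2*h^2));
    y_pred(i) = sum(w .* y_train) / sum(w);
end

rmse = sqrt(mean((y_pred - y_test).^2));   % test error
```

With large datasets, the cost of each prediction grows with the number of stored training points, since all of them enter the sums.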
Pingback: Gaussian Kernel Regression for Multidimensional Feature with Matlab code | Youngmok Yun: Roboticist in The Univ. of Texas at Austin
Pingback: Gaussian Kernel Bandwidth Optimization with Matlab Code | Youngmok Yun: Roboticist in The Univ. of Texas at Austin
Dear Sir,
What is xs(i)? Is it the same as x(i)? Kindly guide.
xs is a query point. To make a mesh, I made lots of query points, xs(i). x(i) is a training data point. I hope that this answers your question.
Dear Sir,
Thank you very much for the quick reply. I have one more question to clear my doubt completely, as given below.
Question: Dear Sir, you have given the following code above:
for i=1:100
xs(i)=i;
ys(i)=gaussian_kern_reg(xs(i),x,y,h);
end
So in place of xs(i), should I use the index values, i.e. 1,2,3…,100, as you did in your code above, or should I use xs(i)=x(i), where x is the input data?
With Best Wishes,
Ganesh D. Kale
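Following the earlier answer that xs is a query point, both choices are possible as long as the query values lie in the same units and range as the training inputs x. A sketch with made-up data whose inputs are not the indices 1..100 (the helper name predict and all values are assumptions for illustration):

```matlab
% Made-up training data; note that x is not simply 1, 2, 3, ...
x = [0.2 0.9 1.5 2.3 3.1 4.0];
y = [1.0 1.4 2.1 2.0 2.8 3.5];
h = 0.6;

% Hypothetical helper: prediction at one scalar query point xq.
predict = @(xq) sum(exp(-(xq - x).^2 / (2*h^2)) .* y) / ...
                sum(exp(-(xq - x).^2 / (2*h^2)));

% Choice 1: a fine mesh over the input range, useful for drawing the curve.
xs_grid = linspace(min(x), max(x), 50);
ys_grid = arrayfun(predict, xs_grid);

% Choice 2: query at the training inputs themselves (xs = x),
% which gives the smoothed value at each training point.
ys_fit = arrayfun(predict, x);
```

Using xs(i) = i only makes sense when the training inputs themselves span roughly 1 to 100, as in the loop quoted above.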
Can you please write code for a truncated Gaussian kernel in Matlab, also known as a Gaussian with finite support within one standard deviation? A normal Gaussian goes from negative infinity to positive infinity, whereas a truncated Gaussian (finite support) extends only one standard deviation. It would be most appreciated!
Sorry, I don't know what the truncated Gaussian kernel is, and I am busy these days, so I cannot write the code now. Let me try later. Sorry.
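One possible reading of the request, sketched under the assumption that "truncated" means the Gaussian weight is set to zero for training points farther than one bandwidth h from the query. This interpretation, the data, and the variable names are assumptions, not an established definition from this article.

```matlab
% Made-up training data and a single query point.
x  = [1 2 3 4 5 6];
y  = [1 3 2 5 4 6];
h  = 1.5;                 % bandwidth, also used as the cutoff radius (assumption)
xq = 3.2;

d = abs(xq - x);                              % distance to each training point
w = exp(-d.^2 / (2*h^2)) .* (d <= h);         % zero weight outside the cutoff

if sum(w) > 0
    yq = sum(w .* y) / sum(w);                % weighted average of nearby points
else
    yq = NaN;                                 % no training point within the cutoff
end
```

The NaN branch matters in practice: with finite support, a query far from all training data has no neighbors to average.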
Hey, so I managed to do the truncated Gaussian. I wanted to see if you have Matlab code for Gaussian kernel regression that does polynomial fitting of nth order, as opposed to the simple zero order of Nadaraya-Watson!
thanks
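A rough sketch of the nth-order idea, usually called local polynomial regression: at each query point, fit a polynomial in (x - xq) by least squares with Gaussian weights and take the constant term as the prediction. This is a generic illustration with made-up data, not code from this blog; with n = 0 it reduces to the zero-order weighted average.

```matlab
% Made-up data lying exactly on a straight line.
x  = (0:0.5:10)';
y  = 2*x + 1;
h  = 1;                   % kernel bandwidth
n  = 1;                   % polynomial order (1 = local linear)
xq = 0;                   % query at the left boundary

w = exp(-(x - xq).^2 / (2*h^2));          % Gaussian weights

% Design matrix with columns (x - xq).^0, (x - xq).^1, ..., (x - xq).^n.
A = zeros(numel(x), n + 1);
for p = 0:n
    A(:, p + 1) = (x - xq).^p;
end

% Weighted least squares: scale rows by sqrt(w), then solve with backslash.
sw   = sqrt(w);
beta = (A .* repmat(sw, 1, n + 1)) \ (sw .* y);
yq   = beta(1);                            % fitted value at the query point

% Zero-order (weighted average) prediction for comparison.
yq_nw = sum(w .* y) / sum(w);
```

On this straight-line data the local linear fit recovers the true value at the boundary, while the zero-order average is pulled upward because all neighbors lie on one side of the query.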