Gaussian Kernel Bandwidth Optimization with Matlab Code
This article covers optimization of the Gaussian kernel bandwidth, with Matlab code.
First, I will briefly explain a method for optimizing the bandwidth of a Gaussian kernel in regression problems: cross validation.
Then, I will share my Matlab code, which optimizes the bandwidths for Gaussian kernel regression. For the theory and source code of the regression itself, read my previous posts <link for 1D input>, <link for multidimensional input>. The code can optimize bandwidths for multidimensional inputs. If you already know the theory of cross validation, or you don’t need to know how my program works, just download the zip file from the link below and run the demo programs. You should be able to use the program without much difficulty.
1. Bandwidth optimization by a cross validation method
The most common way to optimize a regression parameter is cross validation. If you want to learn about cross validation in depth, I recommend reading this article. Here I will briefly explain the cross validation method that I use. It is only one of several ways to do cross validation.
1. Randomly sample 75% of the data set as the training set, and put the remaining 25% into the test set.
2. Using the training set, build a regression model. With that model, predict the outputs of the test set.
3. Compare the predicted outputs with the actual outputs. Then find the best model (the best bandwidth), i.e. the one that minimizes the gap (e.g., RMSE) between the predicted and actual outputs.
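The three steps above can be sketched in a few lines of Matlab. This is only an illustration for 1D inputs, not the code from the zip file: the synthetic data, the grid of candidate bandwidths, and all variable names are assumptions made for the example, and the search is a plain grid search rather than an optimizer.

```matlab
% Hold-out cross validation for a 1D Gaussian kernel regression bandwidth.
% Illustrative sketch only: data, candidate grid and names are assumptions.
% Uses implicit expansion, so it needs Matlab R2016b or later.
rng(0);                                   % fix the seed for repeatability
N = 200;
x = linspace(0, 2*pi, N)';                % synthetic inputs (assumed)
y = sin(x) + 0.2*randn(N, 1);             % noisy synthetic outputs (assumed)

% Step 1: random 75% / 25% split into training and test sets
idx    = randperm(N);
nTrain = round(0.75 * N);
xTrain = x(idx(1:nTrain));      yTrain = y(idx(1:nTrain));
xTest  = x(idx(nTrain+1:end));  yTest  = y(idx(nTrain+1:end));

% Steps 2 and 3: for each candidate bandwidth, predict the test outputs
% with the training data and record the RMSE against the actual outputs
hCandidates = linspace(0.05, 2, 40);
rmse = zeros(size(hCandidates));
for k = 1:numel(hCandidates)
    h = hCandidates(k);
    D = xTest - xTrain';                  % pairwise differences (nTest x nTrain)
    W = exp(-D.^2 / (2*h^2));             % Gaussian kernel weights
    yPred = (W * yTrain) ./ sum(W, 2);    % kernel-weighted average of training outputs
    rmse(k) = sqrt(mean((yPred - yTest).^2));
end

% The best bandwidth is the candidate with the smallest test RMSE
[bestRmse, kBest] = min(rmse);
hBest = hCandidates(kBest);
fprintf('best bandwidth: %.3f (RMSE %.3f)\n', hBest, bestRmse);
```

A grid search is easy to read but scales poorly with the number of input dimensions, which is one reason to hand the same RMSE cost to a numerical optimizer instead.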
2. Matlab code for the algorithm
You can download all functions and demo programs from the link below.
This program handles multidimensional inputs (1D works too, of course). The most important function is Opt_Hyp_Gauss_Ker_Reg( h0,x,y ), and it requires the Matlab Optimization Toolbox. I am attaching two demo programs and their results. I made the demo programs as simple as I could, so I believe everybody can follow them.
<Demo 1D>
<Demo 2D>
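For readers who want to see the idea without downloading anything, here is a minimal sketch of bandwidth optimization for 2D inputs, with one bandwidth per input dimension. It is not Opt_Hyp_Gauss_Ker_Reg: the helper holdoutRmse, the synthetic data, and the variable names are hypothetical, and fminsearch from base Matlab stands in for the Optimization Toolbox. Save it as a script in R2016b or later, since it uses a local function and implicit expansion.

```matlab
% Sketch: optimize one bandwidth per input dimension by minimizing hold-out RMSE.
% Not the author's Opt_Hyp_Gauss_Ker_Reg; names and data are assumptions.
rng(1);
N = 300;
X = 4*rand(N, 2) - 2;                          % 2D inputs in [-2, 2]^2 (assumed)
y = sin(X(:,1)) + 0.5*X(:,2).^2 + 0.1*randn(N, 1);

% Random 75% / 25% split
idx    = randperm(N);
nTrain = round(0.75 * N);
Xtr = X(idx(1:nTrain), :);      ytr = y(idx(1:nTrain));
Xte = X(idx(nTrain+1:end), :);  yte = y(idx(nTrain+1:end));

h0 = [0.5 0.5];                                % initial bandwidths
% Search over log(h) so the bandwidths stay positive
cost = @(logh) holdoutRmse(exp(logh), Xtr, ytr, Xte, yte);
logh = fminsearch(cost, log(h0));
hOpt = exp(logh);
fprintf('optimized bandwidths: [%.3f %.3f]\n', hOpt);

function e = holdoutRmse(h, Xtr, ytr, Xte, yte)
    % Sum of squared differences, each dimension scaled by its bandwidth
    S = zeros(size(Xte, 1), size(Xtr, 1));
    for j = 1:numel(h)
        S = S + ((Xte(:, j) - Xtr(:, j)') / h(j)).^2;
    end
    W = exp(-0.5 * S);                         % product of 1D Gaussian kernels
    yPred = (W * ytr) ./ (sum(W, 2) + eps);    % eps guards against all-zero weights
    e = sqrt(mean((yPred - yte).^2));
end
```

Because the split is random, the optimized bandwidths change somewhat from run to run unless the seed is fixed.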
I hope this post saves you time and effort in your work. If you have any questions, please leave a reply.
-Mok-
—————————————————————————————————
I am Youngmok Yun, and I write about robotics theory and my research.
My main site is http://youngmok.com, and the Korean version is http://yunyoungmok.tistory.com.
—————————————————————————————————
Hi Youngmok,
I’m a first-year finance student. I’m very new to kernel estimation and really interested in it. So, if you don’t mind, please give me some examples of how to apply this in finance, for example if I input ‘x’ as return and ‘y’ as speed of updates (an HFT proxy, 300 buy or sell orders per minute). If I use these variables, what will I get from the kernel? Please share some of your knowledge.
Regards,
Pongsutti
Hi, Pongsutti,
Thanks for visiting my blog. Simply put, this is one of many possible regression methods. It is a curve fitting method: once the curve is fitted, you can use it to predict the output for a given input. I don’t have much knowledge of finance, but you can probably apply curve fitting to some financial problem.
For the example you describe, x could be “the maximum speed of update” and y could be “return (cost? price?)”. Then you can predict a likely y for any given x.
Sorry that I don’t know finance well, but you should be able to find many regression applications on Google. Thanks again and have a nice day!
Hi,
Thank you so much for your reply. I’m really interested in nonparametric methods, so I’m trying to learn from your website. I have two questions:
1. Can you send me a link to the article you mention in the “Bandwidth optimization by a cross validation method” section? (The link currently does not work.)
2. Is it possible to apply kernel regression with multiple variables? (Is that the same as a multivariate kernel?)
Thanks again for your reply.
Pongsutti
It’s my pleasure,
1.
This is a very famous article for the cross validation method.
http://www.rochester.edu/College/psc/clarke/405/EfronGong.pdf
2.
If you are interested in a multivariate regression method, please read this post: http://youngmok.com/gaussian-kernel-regression-for-multidimensional-feature-with-matlab-code/
I hope these help.
Your code helps a lot with my thesis, many thanks.
Which method of cross validation do you use in this code?
Is it K-fold or hold-out? It looks like hold-out to me (my statistics is poor).
I just need to be sure. Thanks.
In the attached file, see “Opt_Hyp_Gauss_Ker_Reg.m”. There, the constant alpha is the ratio used to sample a new training set. For example, if the original training set has N = 100 samples and alpha = 0.75, then 75 samples are used as the sampled training data and 25 samples are used for validation. I hope this answers your question.
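As a small illustration of that split, the sketch below divides N = 100 sample indices with alpha = 0.75, which corresponds to what is usually called hold-out validation. A K-fold partition is shown next to it only for contrast. The variable names are made up for the example and are not copied from Opt_Hyp_Gauss_Ker_Reg.m.

```matlab
% Hold-out split with ratio alpha, next to a K-fold partition for contrast.
% Variable names are illustrative assumptions.
N = 100;
alpha = 0.75;
idx = randperm(N);                         % random order of the sample indices
nTrain = round(alpha * N);                 % 75 samples
trainIdx = idx(1:nTrain);                  % sampled training set
validIdx = idx(nTrain+1:end);              % remaining 25 samples for validation
fprintf('%d training, %d validation\n', numel(trainIdx), numel(validIdx));

% For comparison: in K-fold validation every sample is used for validation once
K = 4;
foldOfSample = mod(0:N-1, K) + 1;          % fold label 1..K for each position in idx
for k = 1:K
    validK = idx(foldOfSample == k);       % validation samples of fold k
    trainK = idx(foldOfSample ~= k);       % all other samples train the model
end
```

With a single hold-out split the chosen bandwidth depends on which samples happen to land in the validation part; K-fold reduces that dependence at the cost of fitting the model K times.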
Hello, this is Habib. My 4th-year final project is on Gaussian processes. My supervisor suggested that I understand the example code provided in scikit-learn. I saw many kernels used there, such as RBF, White, ExpSine, etc., which are later combined, but as a beginner I have zero knowledge about kernels, especially why we use a kernel and when to use which one. Can you help me find these answers?
You should be able to find a good existing tutorial on kernels for machine learning, e.g., https://www.youtube.com/watch?v=9IfT8KXX_9c
Good luck!