Analyzing Impact of Sensor ID on Regression Model Output in MATLAB

Hello,
I'm currently using the Regression Learner App to create a model that analyses the output of various gas sensors. I achieve this by comparing the voltage output from each sensor against a reference dataset.
Each of my gas sensors is affected by both temperature and humidity. However, each sensor possesses unique characteristics with respect to these two factors.
My model takes the following inputs:
  • Sensor Output
  • Temperature
  • Humidity
  • Sensor ID
I have observed that when the 'Sensor ID' parameter is not incorporated into the regression app, the model tends to generate a more generalized output. On the other hand, when the 'Sensor ID' is included, the model produces a significantly more accurate output because it also accounts for the unique characteristics of each sensor.
I am interested in quantifying the impact of the 'Sensor ID' on the model's signal. Is there a method to achieve this using the Regression Learner App, or perhaps through another approach?
Ideally, understanding the extent of the effect of 'Sensor ID' would enable me to avoid the need for training a new model for each new sensor. Instead, I could simply adjust the input signal to align with the existing model.
I appreciate any guidance you can provide. Thank you in advance.

Answers (1)

Rohit on 15 Jun 2023
Hi Dharmesh,
I understand that you want to quantify the impact of 'Sensor ID' on your regression model in the Regression Learner App.
Here are some approaches you can try to better understand the effect of this feature:
  • Train a regression model on your data using the Regression Learner App, including 'Sensor Output', 'Temperature', 'Humidity', and 'Sensor ID' as predictors. Use the Feature Selection option in the Regression Learner App to analyze the importance of each predictor in your model.
  • You can also use the built-in 'Partial Dependence' plot to visualize how each predictor affects the response when all other predictors are held constant. This may give you insight into the interaction between 'Sensor ID' and other predictors.
Based on the results of your analysis, you can decide whether to exclude 'Sensor ID' as a predictor in future models or adjust your input signal accordingly.
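One rough way to put a number on the contribution of 'Sensor ID' outside the app is to cross-validate the same model type twice, once with and once without that predictor, and compare R². The sketch below is only an illustration: the data is synthetic, the response name `Reference`, the per-sensor offsets, and the temperature slopes are all made-up assumptions, and the exponential GPR settings are meant to resemble, not reproduce, what the app trains.

```matlab
% Sketch: cross-validated R^2 of an exponential GPR with and without SensorID.
% All data is synthetic; offsets and slopes below are invented for illustration.
rng(0);                                        % reproducible example
n = 300;
SensorID     = categorical(randi(3, n, 1));    % three hypothetical sensors
Temperature  = 15 + 15*rand(n, 1);             % degC
Humidity     = 30 + 50*rand(n, 1);             % percent RH
SensorOutput = rand(n, 1);                     % volts
offset = [0; 0.5; -0.4];                       % per-sensor offset (assumed)
tslope = [0.01; 0.03; -0.02];                  % per-sensor temperature slope (assumed)
id = double(SensorID);                         % category index 1..3
Reference = 2*SensorOutput + offset(id) + tslope(id).*Temperature ...
            + 0.002*Humidity + 0.05*randn(n, 1);
T = table(SensorOutput, Temperature, Humidity, SensorID, Reference);

% Use the same folds for both models so the comparison is like for like.
cvp  = cvpartition(n, 'KFold', 5);
opts = {'KernelFunction', 'exponential', 'Standardize', true, 'CVPartition', cvp};
cvWith    = fitrgp(T, 'Reference', opts{:});
cvWithout = fitrgp(removevars(T, 'SensorID'), 'Reference', opts{:});

% R^2 computed from out-of-fold predictions.
r2 = @(y, yhat) 1 - sum((y - yhat).^2) / sum((y - mean(y)).^2);
r2With    = r2(T.Reference, kfoldPredict(cvWith));
r2Without = r2(T.Reference, kfoldPredict(cvWithout));
fprintf('R^2 with SensorID: %.3f, without: %.3f, gain: %.3f\n', ...
        r2With, r2Without, r2With - r2Without);
```

The gain in R² is a single summary number for how much of the response variance only 'Sensor ID' explains. It does not say how each sensor differs, only that the difference matters.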
Additionally, the MATLAB documentation on feature importance and partial dependence plots covers these tools in more detail.
  1 Comment
Dharmesh Joshi on 19 Jun 2023
Hi Rohit,
Thank you for your response. I've been using the Regression Learner App and found that the Exponential GPR model seems to give the best results. However, I believe this model is not as easy to interpret as simpler ones.
Presently, I'm exploring how we might interpret the effect of different sensor IDs. We have multiple sensors, each generally following a similar pattern but with unique characteristics. For example, the impact temperature or humidity has on a sensor's sensitivity could be unique to each sensor. I suspect these relationships might be linear or exponential.
Therefore, our goal is to use GPR as a more generalized model. If we can comprehend the unique effects of each sensor's characteristics, we can make those elements interpretable across our other platforms. This understanding could also potentially aid in predicting the behaviour of future sensors as we implement them.
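If the per-sensor effects really are close to linear, one interpretable option alongside the GPR is a linear model with a predictor-by-sensor interaction term, which gives each sensor its own offset and its own temperature slope. This is a minimal sketch under that assumption, using synthetic data; the response name `Reference` and all numeric values are hypothetical, and the same pattern would apply to a `Humidity*SensorID` term.

```matlab
% Sketch: per-sensor temperature sensitivity from a linear interaction model.
% Synthetic data; the "true" offsets and slopes are invented for illustration.
rng(1);
n = 300;
SensorID     = categorical(randi(3, n, 1));    % three hypothetical sensors
Temperature  = 15 + 15*rand(n, 1);
Humidity     = 30 + 50*rand(n, 1);
SensorOutput = rand(n, 1);
offset = [0; 0.5; -0.4];                       % assumed per-sensor offset
tslope = [0.01; 0.03; -0.02];                  % assumed per-sensor temperature slope
id = double(SensorID);
Reference = 2*SensorOutput + offset(id) + tslope(id).*Temperature ...
            + 0.002*Humidity + 0.05*randn(n, 1);
T = table(SensorOutput, Temperature, Humidity, SensorID, Reference);

% Temperature*SensorID expands to Temperature + SensorID + Temperature:SensorID,
% i.e. a separate intercept shift and temperature slope for each sensor.
lm = fitlm(T, 'Reference ~ SensorOutput + Humidity + Temperature*SensorID');
disp(lm.Coefficients);

% Recover each sensor's effective temperature slope from two predictions
% that differ by 1 degC, with the other inputs held fixed.
ids   = categories(T.SensorID);
slope = zeros(numel(ids), 1);
for k = 1:numel(ids)
    q = table([0.5; 0.5], [20; 21], [50; 50], categorical(ids([k; k]), ids), ...
        'VariableNames', {'SensorOutput', 'Temperature', 'Humidity', 'SensorID'});
    yq = predict(lm, q);
    slope(k) = yq(2) - yq(1);
end
disp(table(ids, slope, 'VariableNames', {'SensorID', 'TemperatureSlope'}));
```

The fitted slopes and offsets are plain numbers that can be carried over to other platforms, which is the kind of interpretability the GPR alone does not offer. Whether a linear form is adequate for real sensors would need to be checked against the residuals.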
Yes, with Feature Selection, when Sensor ID (which distinguishes each sensor from the others) is included, R² rises from about 0.6 to about 0.79. The Partial Dependence Plot seems to take a long time to load. Is this expected?
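On the slow plot: partial dependence evaluates the model for every combination of query point and observation, and exact GPR prediction gets more expensive as the training set grows, so long load times are plausible for larger datasets. A possible workaround, sketched below on synthetic data, is to compute partial dependence at the command line on a random subsample of observations. The response name `Reference` and all data values are assumptions for illustration only.

```matlab
% Sketch: partial dependence of a GPR on a subsample of observations.
% Synthetic data; offsets and slopes are invented for illustration.
rng(2);
n = 200;
SensorID     = categorical(randi(3, n, 1));    % three hypothetical sensors
Temperature  = 15 + 15*rand(n, 1);
Humidity     = 30 + 50*rand(n, 1);
SensorOutput = rand(n, 1);
offset = [0; 0.5; -0.4];
tslope = [0.01; 0.03; -0.02];
id = double(SensorID);
Reference = 2*SensorOutput + offset(id) + tslope(id).*Temperature ...
            + 0.002*Humidity + 0.05*randn(n, 1);
T = table(SensorOutput, Temperature, Humidity, SensorID, Reference);

gpr = fitrgp(T, 'Reference', 'KernelFunction', 'exponential', 'Standardize', true);

% Average the predictions over 100 sampled rows instead of all rows.
[pdT, xT] = partialDependence(gpr, 'Temperature', 'NumObservationsToSample', 100);

% For a categorical predictor the query points are its categories, so this
% returns one averaged prediction per sensor.
[pdID, xID] = partialDependence(gpr, 'SensorID', 'NumObservationsToSample', 100);
disp(table(xID(:), pdID(:), 'VariableNames', {'SensorID', 'MeanPrediction'}));
fprintf('Spread of mean prediction across sensors: %.3f\n', max(pdID) - min(pdID));
```

`plotPartialDependence` accepts the same `'NumObservationsToSample'` option if a figure is preferred. The spread of the per-sensor values gives another rough measure of how strongly 'Sensor ID' shifts the prediction on average.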


Release

R2023a
