
Optimize an RL Agent for DC-Motor Speed Control

Hello everyone,
I am trying to replace a PI controller with an RL agent to achieve simple speed control of a motor (at the moment without current control). So far I have managed to get the RL agent to behave like a P controller: it holds its set speed well and also corrects quickly after a step. However, a steady-state error of 10-200 rpm remains (depending on the specified target speed). I am currently observing the current speed, the error, and the integrated error.
I penalize the agent in proportion to the error and reward it once it gets within 50 rpm of the target speed.
In the graph I simulated a brake being applied from the 2nd second onwards. The goal is to reach the desired speed despite the brake. Unfortunately, an error remains, together with an oscillation.
I am running out of ideas for what to change to teach the RL agent robust PI-controller behavior, and I would appreciate any suggestions. As a template for the actor and critic I used the water tank example.
Another problem is that so far the agent has only been able to learn the behavior for positive speeds. Teaching it to behave the same way in the negative range as in the positive range, simply with a negative voltage, has not worked yet.
Thanks in advance for any answers.
  2 Comments
madhav on 7 Nov 2023
Hi Franz,
were you able to control the speed in the end? If you have, please share the code for my reference.


Answers (1)

Yash Sharma on 20 Oct 2023
Hi Franz,
I understand that you want to replace the PI controller with an RL (Reinforcement Learning) agent and would like to increase the accuracy of the system. To achieve this, you can consider the following:
  • Adjust the reward function: instead of penalizing the agent linearly for errors, use a reward function that penalizes larger errors more heavily, such as a quadratic or exponential penalty. This helps the agent prioritize reducing the error more effectively (a reward-shaping sketch follows this list).
  • Experiment with different network architectures: try increasing the depth or width of the neural networks used for the actor and critic. This gives the agent more capacity to learn complex control strategies (see the agent sketch after this list).
  • Tune the exploration: you can try adjusting the exploration rate or using different exploration strategies, such as epsilon-greedy or noise-based exploration. This allows the agent to explore a wider range of actions and potentially discover better control strategies.
  • Explore different reward structures: In addition to the error, consider incorporating other factors into the reward function. For example, you can include a term that rewards the agent for maintaining a stable and smooth response, such as penalizing large changes in control output. This can encourage the agent to learn a more robust and stable control strategy.
  • Adjust hyperparameters: Hyperparameters, such as learning rate, discount factor, and exploration rate decay, can significantly impact the learning process. Experiment with different values for these hyperparameters to find the ones that work best for your specific problem.
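As a rough illustration of the first and fourth points, here is a minimal reward-shaping sketch. The function name, the variables (e is the speed error in rpm, du is the change in commanded voltage since the previous step) and the scaling factors are illustrative assumptions, not values from your model:

function r = speedReward(e, du)
% Hypothetical shaped reward for the speed-tracking agent.
% e  - speed error in rpm (target speed minus measured speed)
% du - change in the commanded voltage since the previous step
r = -(e/100)^2;        % quadratic penalty: large errors cost disproportionately more
if abs(e) < 50
    r = r + 1;         % bonus once the speed is within 50 rpm of the target
end
r = r - 0.1*du^2;      % penalize large changes in the control output (smoothness)
end

The weights (100, 1, 0.1) only set the relative importance of tracking accuracy versus smoothness and will need tuning for your motor.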
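For the network size, exploration, and hyperparameter points, a sketch along the lines of the water tank DDPG example might look like the following. The observation/action limits, hidden-layer width, sample time, noise levels, and learning rates are illustrative starting points, not validated values:

% Observations: [speed; error; integrated error]; action: motor voltage.
obsInfo = rlNumericSpec([3 1]);
actInfo = rlNumericSpec([1 1], 'LowerLimit', -24, 'UpperLimit', 24);  % assumed +/-24 V supply

% Wider default actor/critic networks than in the water tank example.
initOpts = rlAgentInitializationOptions('NumHiddenUnit', 128);

% Agent options: sample time, discount, exploration noise, and learning rates.
agentOpts = rlDDPGAgentOptions( ...
    'SampleTime', 0.01, ...
    'DiscountFactor', 0.99, ...
    'MiniBatchSize', 128, ...
    'ExperienceBufferLength', 1e6);
agentOpts.NoiseOptions.StandardDeviation          = 2;     % wider exploration over the voltage range
agentOpts.NoiseOptions.StandardDeviationDecayRate = 1e-5;  % let exploration decay slowly
agentOpts.ActorOptimizerOptions.LearnRate  = 1e-4;
agentOpts.CriticOptimizerOptions.LearnRate = 1e-3;

agent = rlDDPGAgent(obsInfo, actInfo, initOpts, agentOpts);

Which of these values matters most depends on your motor model and sample time, so treat them as starting points for experimentation.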
Please also refer to the relevant documentation for further reference.
Hope this helps!

Release

R2022b
