Error in running MBPO for multidimensional action

I want to reuse the MBPO code from (Train MBPO Agent to Balance Cart-Pole System) to model a discrete action space environment. I have already developed my environment, but I am getting the following error:
Error using reshape: Number of elements must not change. Use [] as one of the size inputs to automatically calculate the appropriate size for that dimension.
I tried to locate the error. In the rlDQNAgent.m file, I found that once the episode count reaches the NumMiniBatches value I predefined for the MBPO agent, the mini-batch becomes non-empty and an if-clause that calculates the function gradient is executed. The error comes from this gradient calculation, where reshape appears to receive input of the wrong size.
I have structured my action set as a column-and-row dataset, and it seems that the problem is not from my environment.
The state space is similar to the cart-pole one, with different ranges and dimensions.
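For reference, a setup along these lines is what I have in mind (the values below are illustrative, not my actual specifications):

```matlab
% Illustrative sketch of a multidimensional discrete action space:
% each element of the cell array is one action, given as a column vector.
actInfo = rlFiniteSetSpec({[-1;-1], [-1;1], [1;-1], [1;1]});

% Cart-pole-like observation spec with adjusted dimension and ranges:
obsInfo = rlNumericSpec([4 1], LowerLimit=-inf, UpperLimit=inf);
```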
Any help would be highly appreciated.
  2 Comments
Takeshi Takahashi on 27 Jun 2023
Hi Sahar,
Can you first train the DQN agent (without MBPO) in your environment to see whether you get the same error?
The issue may come from the critic network in the DQN agent or your environment if you get a similar error. Please double-check both the agent and the environment in that case.
Please contact our support if DQN works but MBPO doesn't.
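For example, something along these lines (a minimal sketch, assuming env is your custom environment object, with a default critic network):

```matlab
% Train a standalone DQN agent (no MBPO) on the custom environment
% to isolate the source of the reshape error. "env" is assumed to be
% your custom environment object.
obsInfo = getObservationInfo(env);
actInfo = getActionInfo(env);
agent = rlDQNAgent(obsInfo, actInfo);   % agent with a default critic

trainOpts = rlTrainingOptions( ...
    MaxEpisodes=50, ...
    StopTrainingCriteria="EpisodeCount", ...
    StopTrainingValue=50);

% If the same reshape error appears here, the issue is in the critic
% or the environment rather than in the MBPO wrapper.
trainingStats = train(agent, env, trainOpts);
```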
Best,
Takeshi
Sahar Keshavarz on 7 Jul 2023
Dear Takeshi,
Thank you for the valuable comment.
I have tried to focus on the DQN agent, and unfortunately I still have the same issue. The critic is created and checked with a random observation, the agent is created and checked with a random observation, and training then proceeds without any problem up to the episode number equal to the mini-batch size, at which point I get the same error.
How can I relate that to my environment? If I have the same reward for all mini-batch particles, can this cause the issue?
I appreciate your support.
Best Regards,
Sahar


Answers (1)

Shivansh on 5 Oct 2023
Hi Sahar,
I understand that you are encountering an error related to the reshape function in the MBPO code when trying to model a discrete action space environment. The error message suggests that the number of elements in the reshaped array is changing, which is causing the issue.
Based on your description, it appears that the problem lies in the calculation of the function gradient in the rlDQNAgent.m file. The reshape function is likely receiving incorrect input somewhere in this process.
If you are encountering the same error in the DQN agent during training, even after verifying the critic network and agent creation, it is possible that the issue is related to your environment, or the way rewards are handled.
Having the same reward for all mini-batch particles should not directly cause the reshape error. However, it is crucial to ensure that your reward calculations and handling are correct.
To resolve this issue, you can try the following steps:
  1. Verify the dimensions of your action set: ensure that the dimensions of your action set match the expected inputs of the reshape operation. If you have structured your action set as a column-and-row dataset, make sure it is shaped consistently with what the reshape operation expects.
  2. Review the implementation of the function-gradient calculation: inspect the code in the rlDQNAgent.m file that computes the gradient, and check whether the reshape operation is applied to the correct variables with compatible dimensions.
  3. Check the compatibility of the state space and action space: ensure that the dimensions and ranges of your state space and action space are consistent with the requirements of the MBPO code.
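As a concrete starting point for the dimension checks above, assuming "env" is your custom environment object, the built-in validation below can surface mismatches between the declared specifications and what the environment actually returns:

```matlab
% validateEnvironment runs a short simulation and errors if the
% observations or actions produced by the environment do not match
% the dimensions declared in its specifications.
validateEnvironment(env)

% Also confirm the declared dimensions match your intent:
obsInfo = getObservationInfo(env);
disp(obsInfo.Dimension)   % e.g. [4 1] for a cart-pole-like state
actInfo = getActionInfo(env);
disp(actInfo.Elements)    % the discrete action set
```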
If you have gone through these steps and are still unable to identify the source of the reshape error, it might be helpful to provide more specific details about your environment, the code you are using, and any relevant error messages.
Hope it helps!

Release

R2023a
