refactors the code for the environment and agent modules to optimize performance and readability.
- Standardizing the order of arguments in solar step function calls.
- Introducing a percentage-based action voltage adjustment in the module
step function to enhance control flexibility.
- Updating PPO and related files to use Generalized Advantage Estimation
(GAE) by default, which lowers the variance of the advantage estimates
and tends to stabilize training.
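
For reviewers unfamiliar with GAE, the sketch below shows the standard backward recursion over a single rollout. It is an illustration only, not the code in this change: the function name `compute_gae`, its argument names, and the default `gamma` and `lam` values are assumptions chosen for the example.

```python
def compute_gae(rewards, values, dones, last_value, gamma=0.99, lam=0.95):
    """Hypothetical helper: GAE advantages and value targets for one rollout.

    rewards, values, dones are equal-length sequences for steps 0..T-1;
    last_value is the critic's estimate for the state after the final step.
    """
    advantages = [0.0] * len(rewards)
    gae = 0.0
    next_value = last_value
    # Walk the rollout backwards so each step can reuse the running sum.
    for t in reversed(range(len(rewards))):
        # Do not bootstrap across an episode boundary.
        not_done = 0.0 if dones[t] else 1.0
        # One-step TD error.
        delta = rewards[t] + gamma * next_value * not_done - values[t]
        # Exponentially weighted sum of TD errors (weight gamma * lam).
        gae = delta + gamma * lam * not_done * gae
        advantages[t] = gae
        next_value = values[t]
    # Value targets for the critic are advantage plus baseline.
    returns = [a + v for a, v in zip(advantages, values)]
    return advantages, returns
```

Setting `lam=0` reduces this to the one-step TD error, while `lam=1` recovers the discounted Monte Carlo return minus the value baseline; values in between trade bias against variance.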