Does whatever we're using to generate the min-snap trajectories also take into account the collision avoidance cost? I don't get how using the control cost for a min-snap trajectory helps us here. Also, on the note of track association, if we figure it out I don't think we need RL; we can use a graph of convex sets: https://underactuated.mit.edu/trajopt.html#example9. And for doing track association it might be worth looking into methods for lidar object detection, because that's basically the same problem as track association for the particle filter. IDK how hard you expect doing this with RL to be, but if you wanna consider more options before going ahead it might be worth looking into.
No, but the polynomial trajectory will be used to compute the collision cost, so as long as the min-snap interpolation roughly represents the real-life path of the drone if given those granular waypoints, it should be fine. I.e., the model will learn to work around the interpolator.
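For concreteness, here is a rough sketch of how a path through the generated waypoints could be sampled to evaluate a collision cost against tracked obstacles. Linear interpolation stands in for the actual min-snap interpolator (the policy only needs the cost to roughly reflect the flown path), and the waypoint/obstacle shapes and the hinge penalty are hypothetical choices:

```python
import numpy as np

def collision_cost(waypoints, obstacles, radius=1.0, n_samples=100):
    """Sample a dense path through the waypoints and penalize proximity
    to obstacle centers. Linear interpolation is a crude stand-in for
    the min-snap interpolator."""
    waypoints = np.asarray(waypoints, dtype=float)   # (n, 2) or (n, 3)
    obstacles = np.asarray(obstacles, dtype=float)   # (m, same dim)
    t = np.linspace(0.0, 1.0, len(waypoints))
    ts = np.linspace(0.0, 1.0, n_samples)
    # Interpolate each coordinate independently along the path parameter.
    path = np.stack([np.interp(ts, t, waypoints[:, k])
                     for k in range(waypoints.shape[1])], axis=1)
    # Distance from every sampled point to every obstacle center.
    d = np.linalg.norm(path[:, None, :] - obstacles[None, :, :], axis=-1)
    # Hinge penalty: nonzero only inside the safety radius.
    return float(np.maximum(radius - d, 0.0).sum())
```

Swapping in the real min-snap evaluation later only changes how `path` is produced; the cost structure stays the same.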
I've only skimmed the chapter, but these methods only work for static objects, right? If so, it would require us to both do track association and form convex regions for the trajectory to optimize around.
I'll take a look, but I think the most promising thing to do would be to use a NN. Do note, though, that lidar and particle filtering result in substantially different point distributions.
I don't anticipate the RL being more difficult than setting up these optimizations, especially since they're even further outside of my wheelhouse.
Looks good to me. I'm a bit worried about computational effort, since we may want some form of MPC to run (most likely very simple local-horizon planning): when going off track it may be more optimal to regenerate the trajectory from that point rather than make a maneuver to get back on the original track. Maybe we can characterize the vehicle dynamics in the cost functions (i.e., inertia matrix, velocity and acceleration hard constraints (not sure if you had hard constraints)), which could produce a more physically trackable trajectory and avoid MPC. I'm not super familiar with using RL for trajectory generation, so it may be much faster than I expect.
The training might be somewhat computationally involved, but the actor (which generates the waypoints) will be a relatively small neural network, much smaller than YOLO. I wouldn't expect generating a whole trajectory with many points to take more time than one image detection. The plan is to continuously regenerate the trajectory from the current position to the destination.
Yes, the "control cost" will be a weighted sum of the velocity and acceleration. I can make the cost non-linear to reflect hard-ish limits, which I think would be better than a cost-cliff.
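A sketch of what that non-linear cost could look like, using a softplus-style penalty as the "hard-ish" limit instead of a cost-cliff. All weights, limits, and the sharpness value are placeholders:

```python
import numpy as np

def soft_limit(x, limit, sharpness=10.0):
    """Smooth stand-in for a hard constraint: near zero below the limit,
    grows steeply (but differentiably) above it. logaddexp avoids
    overflow for values far past the limit."""
    return np.logaddexp(0.0, sharpness * (np.abs(x) - limit)) / sharpness

def control_cost(vel, acc, v_max=5.0, a_max=3.0, w_v=1.0, w_a=1.0):
    """Weighted quadratic cost on velocity and acceleration, plus soft
    penalties that kick in near the hard-ish limits."""
    vel, acc = np.asarray(vel, dtype=float), np.asarray(acc, dtype=float)
    base = w_v * np.mean(vel ** 2) + w_a * np.mean(acc ** 2)
    penalty = soft_limit(vel, v_max).sum() + soft_limit(acc, a_max).sum()
    return float(base + penalty)
```

Increasing `sharpness` pushes this toward a true cost-cliff; keeping it moderate leaves a usable gradient for the policy.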
Description
Built with stable_baselines3 and gymnasium.

Technical Details
State Space
Action Space
Reward Function
The reward function is a weighted sum of the below functions.
Trajectory Cost
Collision Cost
Raw Particles
Gaussian Tracks

Distance/Time to Target
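Assuming each of the components above produces a scalar cost, the combination can be as simple as the following sketch (component names and weights are hypothetical):

```python
def reward(costs, weights):
    """Weighted sum of the component costs; negated so that lower
    total cost means higher reward for the RL agent."""
    return -sum(weights[name] * costs[name] for name in weights)

# Example usage with placeholder component values:
r = reward(
    costs={"trajectory": 1.0, "collision": 2.0, "time_to_target": 0.5},
    weights={"trajectory": 0.5, "collision": 1.0, "time_to_target": 1.0},
)
```

Keeping the weights in one dict makes it easy to sweep them during training without touching the component implementations.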
Tasks
gymnasium.Env (representation of environment, i.e., inputs and outputs).

Test Plan
Issues
Resources
RL Algorithms
Trajectory Interpolation
Libraries
Point-Cloud Feature Extraction