
Simple common coordinate system definition #55

Open
sfmig opened this issue Mar 15, 2023 · 3 comments
Labels: enhancement (Optional feature)
sfmig (Collaborator) commented Mar 15, 2023

Implement a simple method to transform the pose estimation data from every video to a common coordinate system.

Related to #3, though there we discussed somewhat more sophisticated options.

niksirbi (Member) commented:
I started working on this in the frame-registration branch.

Progress made

  • Convert plotly shapes into Shapely Polygon objects. This is handy because Polygon objects come with many useful built-in methods that we don't have to rewrite (e.g. area, validity checks, bounding box, etc.).
  • Write functions for homography-based registration: given a set of N corresponding 2D points in a moving image and a reference image, an affine transform can be estimated such that the points match (the functions are untested!)
  • The idea was to extract the corresponding points directly from the defined ROIs ("enclosure" is the obvious candidate here), e.g. the points could be the corners of the enclosure.
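The affine-estimation step above can be sketched in plain NumPy; the function name and array shapes here are my own, not necessarily what the frame-registration branch uses. Each point correspondence contributes two linear equations in the six affine parameters, which a least-squares solve recovers (exactly, when the points are truly related by an affine map):

```python
import numpy as np

def estimate_affine(src, dst):
    """Least-squares 2D affine transform mapping src points onto dst.

    src, dst: (N, 2) arrays of corresponding points, N >= 3.
    Returns a 2x3 matrix M such that dst ~= src @ M[:, :2].T + M[:, 2].
    """
    src, dst = np.asarray(src, float), np.asarray(dst, float)
    n = len(src)
    # Two equations per correspondence, in parameters (a, b, tx, c, d, ty):
    #   dst_x = a * src_x + b * src_y + tx
    #   dst_y = c * src_x + d * src_y + ty
    A = np.zeros((2 * n, 6))
    A[0::2, 0:2], A[0::2, 2] = src, 1.0
    A[1::2, 3:5], A[1::2, 5] = src, 1.0
    params, *_ = np.linalg.lstsq(A, dst.reshape(-1), rcond=None)
    return params.reshape(2, 3)
```

With 4 enclosure corners this is overdetermined (8 equations, 6 unknowns), so small clicking errors are averaged out rather than fitted exactly.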

Workflow idea

  • Define the ROIs very well in one video. Click a button named "Save these ROIs as reference", which will download and save the coordinates on disk. The video is henceforth referred to as the "reference video", and the saved ROIs as "reference ROIs".
  • In a different video, define only the enclosure well, and click the "Infer ROIs" button. This will trigger the following:
    • Load saved reference ROIs
    • Extract the 4 corners of the enclosure
    • Compute the transformation matrix based on the loaded 4 points and the 4 corners of the enclosure in the current video
    • Apply the computed transform to all reference ROIs, and bring them into the current video's space.
    • Save the transformation matrix (current video to reference video), maybe in the video's metadata?
    • All the above should happen nearly instantaneously after clicking "Infer ROIs"
    • User can still make adjustments to the inferred ROIs before saving them
  • In the dashboard tab, show which videos have transforms computed for them (similar to how we show which ones have pose estimation done). For those, the transforms can be applied to the trajectories, to enable group visualisations and analyses.
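The "Infer ROIs" steps above could look roughly like the sketch below. The ROI names, coordinates, and function signature are all hypothetical; the transform is estimated with an inline least-squares solve, assuming NumPy only:

```python
import numpy as np

# Hypothetical reference ROIs, purely for illustration.
reference_rois = {
    "enclosure": np.array([[0, 0], [100, 0], [100, 60], [0, 60]], float),
    "nest":      np.array([[10, 10], [30, 10], [30, 25], [10, 25]], float),
}

def infer_rois(reference_rois, current_corners):
    """Estimate the affine transform mapping the reference enclosure corners
    onto the 4 corners defined in the current video, then warp every
    reference ROI into the current video's pixel space."""
    src = np.asarray(reference_rois["enclosure"], float)
    dst = np.asarray(current_corners, float)
    n = len(src)
    # Least-squares solve for the 6 affine parameters (a, b, tx, c, d, ty).
    A = np.zeros((2 * n, 6))
    A[0::2, 0:2], A[0::2, 2] = src, 1.0
    A[1::2, 3:5], A[1::2, 5] = src, 1.0
    params, *_ = np.linalg.lstsq(A, dst.reshape(-1), rcond=None)
    M = params.reshape(2, 3)  # candidate for storing in the video's metadata
    warped = {name: pts @ M[:, :2].T + M[:, 2]
              for name, pts in reference_rois.items()}
    return warped, M
```

The returned `M` maps reference space to the current video; its inverse (or the same solve with `src` and `dst` swapped) maps current-video trajectories back into the reference space for group analyses.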

Blockers

  • The homography-based registration only works with an equal number of points between the moving and reference image. Because of the freeform definition, the enclosure (or any ROI) ends up with multiple vertices, and it's not trivial to simplify them into exactly 4 corner points. I have explored some polygon simplification ideas, but these will break for enclosures of different shapes.
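For reference, one of the simplification ideas alluded to above is Ramer–Douglas–Peucker, sketched here in pure Python. As noted, it only yields exactly 4 corner points for roughly rectangular outlines with a well-chosen tolerance, which is exactly why it breaks for enclosures of other shapes:

```python
def rdp(points, epsilon):
    """Ramer-Douglas-Peucker polyline simplification.

    Keeps the endpoints, recursively keeping the interior point farthest
    from the chord whenever that distance exceeds epsilon.
    """
    if len(points) < 3:
        return list(points)
    (x1, y1), (x2, y2) = points[0], points[-1]
    dmax, idx = 0.0, 0
    for i in range(1, len(points) - 1):
        x0, y0 = points[i]
        # Perpendicular distance from (x0, y0) to the chord.
        num = abs((y2 - y1) * x0 - (x2 - x1) * y0 + x2 * y1 - y2 * x1)
        den = ((y2 - y1) ** 2 + (x2 - x1) ** 2) ** 0.5 or 1.0
        d = num / den
        if d > dmax:
            dmax, idx = d, i
    if dmax > epsilon:
        left = rdp(points[: idx + 1], epsilon)
        right = rdp(points[idx:], epsilon)
        return left[:-1] + right
    return [points[0], points[-1]]
```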

Alternative ideas

  • Convert the ROIs into a binary mask image, and perform image-based registration (instead of point-based). Exploring this could be much easier than solving the issue above with polygons.
  • Don't rely on ROIs at all. Have the user define new points/axes, solely for the purpose of coordinate calibration. This would require adding more UI elements to the ROI tab (so I still prefer the sneaky simplicity of doing the work implicitly based on ROIs).
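The binary-mask alternative only needs a polygon rasteriser before handing the masks to an image registration routine. A minimal even-odd-rule version in NumPy (no OpenCV or Shapely dependency assumed) could look like:

```python
import numpy as np

def polygon_to_mask(vertices, height, width):
    """Rasterise a polygon into a boolean mask using the even-odd rule:
    a pixel centre is inside if a horizontal ray from it crosses an odd
    number of polygon edges."""
    ys, xs = np.mgrid[0:height, 0:width]
    px, py = xs + 0.5, ys + 0.5  # pixel centres
    inside = np.zeros((height, width), dtype=bool)
    v = np.asarray(vertices, float)
    n = len(v)
    for i in range(n):
        x1, y1 = v[i]
        x2, y2 = v[(i + 1) % n]
        # Edges crossing the pixel's horizontal line; horizontal edges never do.
        crosses = (y1 <= py) != (y2 <= py)
        with np.errstate(divide="ignore", invalid="ignore"):
            x_int = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
        inside ^= crosses & (px < x_int)
    return inside
```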

niksirbi (Member) commented:
The above idea kills two birds with one stone:

  • simplifies ROI drawing (the user only has to draw one ROI well per video; the rest are inferred)
  • establishes the pixel coordinates of the "reference video" as the common coordinate system for visualisation and analysis

niksirbi (Member) commented:
Perhaps a better alternative to choosing one video as the "reference" is having a precisely defined model of the environment, based on blueprints of the setup. This way everything is guaranteed to be "straight", and the "right way up", and the coordinates will be in world space (more meaningful than pixel coordinates). This would also get rid of the "use these ROIs as reference" step.

The elephant in the room is that our model can only be a 2D (top-down) projection of the 3D environment, but we swallow that bullet anyway (I'm running out of analogies here), since we only have a single top-view camera.
