-
Notifications
You must be signed in to change notification settings - Fork 73
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
PROD-2700 Implement Soft Delete for PrivacyRequests #5321
base: main
Are you sure you want to change the base?
Conversation
The latest updates on your projects. Learn more about Vercel for Git ↗︎ 1 Skipped Deployment
|
revision = "75bb9ee843f5" | ||
down_revision = "9de4bb76307a" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
note to self: remember to update before merging
def get_privacy_request_or_error( | ||
db: Session, privacy_request_id: str | ||
db: Session, privacy_request_id: str, error_if_deleted: Optional[bool] = True |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the privacy request is deleted, we error by default, unless the error_if_deleted
argument is set to False
@@ -434,6 +443,7 @@ def _filter_privacy_request_queryset( | |||
external_id: Optional[str] = None, | |||
action_type: Optional[ActionType] = None, | |||
include_consent_webhook_requests: Optional[bool] = False, | |||
include_deleted_requests: Optional[bool] = False, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same here, by default we will filter out deleted requests, unless explicitly included
get_privacy_request_or_error(db, privacy_request_id) | ||
get_privacy_request_or_error(db, privacy_request_id, error_if_deleted=False) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I allow get
operations on the deleted privacy request , but not update ones. TBH there weren't any specific requirements regarding this on the ticket, so I used my best judgement. Any feedback / opinions are welcome
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a great thing to run by Michael early on! There's a case to be made either way.
Maybe I would strive for consistency here:get_request_status_logs
, view_uploaded_manual_webhook_data
, view_uploaded_erasure_manual_webhook_data
, and get_individual_privacy_request_tasks
return very specific details about potentially deleted privacy requests by default.
But the search endpoint which is the place to get primary privacy request details does not return deleted privacy requests by default. This latter detail tips me towards wanting to hide everything by default or add query params to the GET ones so the default is "don't show" instead of "show"
self.save(db) | ||
# TODO: do we need to remove cache keys? | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the question is because the "real" delete
method (defined right above) does the following:
cache: FidesopsRedis = get_cache()
all_keys = get_all_cache_keys_for_privacy_request(privacy_request_id=self.id)
for key in all_keys:
cache.delete(key)
for provided_identity in self.provided_identities: # type: ignore[attr-defined]
provided_identity.delete(db=db)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
great question, I wouldn't yet since you're still allowing these privacy requests to come up in search with certain query params.
.filter(PrivacyRequest.status == PrivacyRequestStatus.awaiting_email_send) | ||
.filter(PrivacyRequest.deleted_at.is_(None)) | ||
.order_by(PrivacyRequest.created_at.asc()) # oldest first |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
again, there were no specfic requirements here, but I think it makes sense not to send emails for deleted requests
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
smart
if privacy_request.deleted_at is not None: | ||
logger.info( | ||
"Terminating privacy request {}: request deleted.", privacy_request.id | ||
) | ||
return |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same here, I didn't think it made sense to run an access request for a deleted privacy reqyest
# FIXME: don't skip this | ||
@pytest.mark.skip("temporary cause it hangs?") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Un related to this PR, see #5301 . I've skipped it just so that this file would run on the CI. I can remove the skip before merging
fides Run #10085
Run Properties:
|
Project |
fides
|
Branch Review |
refs/pull/5321/merge
|
Run status |
Passed #10085
|
Run duration | 00m 38s |
Commit |
306a46123a ℹ️: Merge f465edfc8ebf12d0c2fe4fd18c4ce7d9cfff9c39 into 0a0ea4f400ca6d1e10ad3b154f90...
|
Committer | erosselli |
View all properties for this run ↗︎ |
Test results | |
---|---|
Failures |
0
|
Flaky |
0
|
Pending |
0
|
Skipped |
0
|
Passing |
4
|
Upgrade your plan to view test results. | |
View all changes introduced in this branch ↗︎ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just wanted to call out that I tried to filter out the deleted requests from the places where it made sense, but also I don't know if I found all the places where we get a list of privacy requests. Wanted to point it out to get some extra eyes around that if possible
starting review - |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks pretty good but let's step back for a second and clarify with product about expected behavior around "soft deletes". What should be able to be retrieved about a privacy request once it has been soft deleted?
@@ -2441,3 +2488,80 @@ def request_task_async_callback( | |||
queue_request_task(request_task, privacy_request_proceed=True) | |||
|
|||
return {"task_queued": True} | |||
|
|||
|
|||
@router.delete( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this isn't a hard DELETE and it shows up under GET requests, I'd consider a different http method, maybe patch?
op.create_foreign_key( | ||
"privacyrequest_deleted_at_key", | ||
"privacyrequest", | ||
"fidesuser", | ||
["deleted_by"], | ||
["id"], | ||
ondelete="SET NULL", | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This isn't that dangerous of an operation, adding a nullable FK, but it could prevent privacy requests from being written while the table is being scanned? So that might be something we need to call out. If we're concerned about it, we could also add the constraint as not valid, and then validate the constraint in the other. The privacy request table can be large https://squawkhq.com/docs/adding-foreign-key-constraint/
Alternatively, we might consider removing the FK constraint. This has an added benefit that we can track if the root user made the change, which is not technically a real user. This would be similar to AuditLog.user_id instead
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah that's a good idea I think, it may be overkill to have the FK constraint anyways -- I was following what was done for the approved_by column (I think), but it could be argued that this is a different use case anyways
continue | ||
|
||
privacy_request.soft_delete(db, client.user_id) | ||
succeeded.append(privacy_request) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While it's nice to get to re-use the BulkPostPrivacyRequests
schema, since the single delete API endpoint returns a 204, this feels a little jarring to return the whole object.
Perhaps we create a new schema, where the "succeeded" list has just the privacy request id, deleted_at, and deleted_by? or similar
self.save(db) | ||
# TODO: do we need to remove cache keys? | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
great question, I wouldn't yet since you're still allowing these privacy requests to come up in search with certain query params.
privacy_request.soft_delete(db, user_id) | ||
|
||
|
||
@router.post( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe patch
here as well? what do you think?
get_privacy_request_or_error(db, privacy_request_id) | ||
get_privacy_request_or_error(db, privacy_request_id, error_if_deleted=False) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a great thing to run by Michael early on! There's a case to be made either way.
Maybe I would strive for consistency here:get_request_status_logs
, view_uploaded_manual_webhook_data
, view_uploaded_erasure_manual_webhook_data
, and get_individual_privacy_request_tasks
return very specific details about potentially deleted privacy requests by default.
But the search endpoint which is the place to get primary privacy request details does not return deleted privacy requests by default. This latter detail tips me towards wanting to hide everything by default or add query params to the GET ones so the default is "don't show" instead of "show"
.filter(PrivacyRequest.status == PrivacyRequestStatus.awaiting_email_send) | ||
.filter(PrivacyRequest.deleted_at.is_(None)) | ||
.order_by(PrivacyRequest.created_at.asc()) # oldest first |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
smart
Closes PROD-2700
Description Of Changes
Adds support for soft deleting
PrivacyRequests
. The deletion is defined by the new columndeleted_at
, and we also stored who deleted it in thedeleted_by
column. A newDELETE /privacy-request/id
endpoint is added in order to allow soft deleting requests, as well as an endpoint to bulk delete multiple requests at the same timePOST /privacy-request/bulk/delete
.Code Changes
deleted_at
anddeleted_by
toPrivacyRequest
, both nullableDELETE /privacy-request/id
and for bulk deleting themPOST /privacy-request/bulk/delete
include_deleted_requests
, that isFalse
by defaultSteps to Confirm
Error cases
Trying to do anything that modifies the deleted privacy request should cause a 422 error. Some examples:
Pre-Merge Checklist
CHANGELOG.md
main
downgrade()
migration is correct and works