Skip to content

fix: redact pending primary email before retirement delete#229

Open
ktyagiapphelix2u wants to merge 3 commits intorelease-ulmofrom
ktyagi/primary
Open

fix: redact pending primary email before retirement delete#229
ktyagiapphelix2u wants to merge 3 commits intorelease-ulmofrom
ktyagi/primary

Conversation

@ktyagiapphelix2u
Copy link
Copy Markdown

Summary

This change addresses a privacy issue in the retirement flow for users who have a pending primary email change.

Problem
When a user retires with an active row in student_pendingemailchange, LMS deletes that row.
Because Fivetran uses soft delete behavior, the row remains in Snowflake with _fivetran_deleted = true.
Before this fix, the soft-deleted Snowflake row could still contain raw pending email and activation key values.

Root Cause
Retirement path deleted PendingEmailChange directly without redacting sensitive fields first.

What Changed

Added a model helper to redact PendingEmailChange fields for a user before deletion.
Updated retirement flow to call redact first, then delete.
Added tests to verify redaction behavior and ordering.
Updated inline comments and PII annotation text to explicitly state redacted then deleted behavior.

Behavior Before
User retires with pending primary email
LMS deletes pending row
Snowflake soft-deleted row may still show raw pending email

Behavior After
User retires with pending primary email
LMS first redacts pending row values
LMS deletes row
Snowflake soft-deleted row contains redacted values only

Ticket & Reference

https://2u-internal.atlassian.net/browse/BOMS-499

@ktyagiapphelix2u ktyagiapphelix2u marked this pull request as ready for review April 14, 2026 12:49
Copilot AI review requested due to automatic review settings April 14, 2026 12:49
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes a privacy gap in the account retirement flow by ensuring PendingEmailChange data is redacted before deletion, preventing downstream soft-deleted replicas (e.g., Snowflake via Fivetran) from retaining raw pending email and activation key values.

Changes:

  • Added PendingEmailChange.redact_pending_email_by_user_value(...) to overwrite new_email and activation_key prior to removal.
  • Updated the retirement endpoint flow to call redaction before delete_by_user_value.
  • Added/updated tests to verify redaction behavior and that redaction occurs before deletion.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated no comments.

File Description
openedx/core/djangoapps/user_api/accounts/views.py Calls the new redaction helper before deleting PendingEmailChange records during retirement.
openedx/core/djangoapps/user_api/accounts/tests/test_retirement_views.py Adds assertions and a dedicated test to validate redaction happens before deletion in the retirement flow.
common/djangoapps/student/tests/test_models.py Adds unit tests for the new redaction helper’s behavior and no-op behavior when no record exists.
common/djangoapps/student/models/user.py Implements the new redaction helper and updates the PII annotation to reflect “redacted then deleted”.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

for record in records_matching_user_value:
record.new_email = get_retired_email_by_email(record.new_email)
record.activation_key = uuid.uuid4().hex
record.save(update_fields=['new_email', 'activation_key'])
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ktyagiapphelix2u, Do you need to save activation_key?

# Retire misc. models that may contain PII of this user.
# Redact pending primary email fields before delete because
# downstream replication can preserve soft-deleted snapshots.
PendingEmailChange.redact_pending_email_by_user_value(user, field="user")
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ktyagiapphelix2u , Although this may not matter much but reaction value is user here, can you please confirm?

Copilot AI review requested due to automatic review settings April 16, 2026 10:34
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread common/djangoapps/student/tests/test_models.py
Comment on lines +1521 to +1526
pending_email_before_retirement = PendingEmailChange.objects.get(user=self.test_user).new_email
expected_retired_pending_email = get_retired_email_by_email(pending_email_before_retirement)

def _assert_redacted_then_delete(value, field):
pending_record = PendingEmailChange.objects.get(user=self.test_user)
assert pending_record.new_email == expected_retired_pending_email
Copy link

Copilot AI Apr 16, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ordering test only verifies new_email is redacted before delete_by_user_value runs. Since the motivating issue includes activation_key retention in soft-deleted replicas, extend the assertion here to verify activation_key is also redacted/changed before deletion once the model helper is updated.

Suggested change
pending_email_before_retirement = PendingEmailChange.objects.get(user=self.test_user).new_email
expected_retired_pending_email = get_retired_email_by_email(pending_email_before_retirement)
def _assert_redacted_then_delete(value, field):
pending_record = PendingEmailChange.objects.get(user=self.test_user)
assert pending_record.new_email == expected_retired_pending_email
pending_email_record = PendingEmailChange.objects.get(user=self.test_user)
pending_email_before_retirement = pending_email_record.new_email
activation_key_before_retirement = pending_email_record.activation_key
expected_retired_pending_email = get_retired_email_by_email(pending_email_before_retirement)
def _assert_redacted_then_delete(value, field):
pending_record = PendingEmailChange.objects.get(user=self.test_user)
assert pending_record.new_email == expected_retired_pending_email
assert pending_record.activation_key != activation_key_before_retirement

Copilot uses AI. Check for mistakes.
Comment on lines +914 to +932
@classmethod
def redact_pending_email_by_user_value(cls, value, field):
"""
Redact pending email change fields for records matching ``field=value``.

This method is intended for retirement flows where downstream replication
may keep soft-deleted snapshots of these rows.
"""
filter_kwargs = {field: value}
records_matching_user_value = cls.objects.filter(**filter_kwargs)

if not records_matching_user_value.exists():
return False

for record in records_matching_user_value:
record.new_email = get_retired_email_by_email(record.new_email)
record.save(update_fields=['new_email'])

return True
Copy link

Copilot AI Apr 16, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

redact_pending_email_by_user_value only redacts new_email, but leaves activation_key unchanged. The PR description/incident explains Snowflake soft-deleted snapshots can retain both pending email and activation key values, so keeping activation_key intact still leaks a sensitive token. Suggest redacting activation_key as well (e.g., replace with a new random/unique value to satisfy the unique=True constraint) and update the method/docstring accordingly.

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants