Dq 665 contract delete#600
Open
shivahanumanthula-atlan wants to merge 5323 commits into apache:master from
MS-625 : Handle ClassCastException in ABAC evaluation
- Add tenant health verification after Temporal workflow completes
- Poll Temporal workflow status until completion (60min timeout)
- Health checks per tenant:
  - Connect to tenant vCluster via vcluster platform
  - Verify all atlas pods exist and containers are ready
  - Verify image pattern (repo-ring) and tag match expected
  - Port-forward to atlas service and check /api/atlas/admin/status
- Post health check results as PR comment
- Setup kubectl, vCluster CLI, and improve VPN connection handling
- Use GLOBALPROTECT_PORTAL_URL from vars instead of hardcoded value

Co-authored-by: Cursor <cursoragent@cursor.com>
feat: Add e2e health checks to cohort release workflow
GITHUB_PATH changes only apply to subsequent steps, not the current step. Use full path $HOME/.temporalio/bin/temporal for version check. Co-authored-by: Cursor <cursoragent@cursor.com>
fix: Use full path for temporal CLI in setup step
- Configure routing for 172.17.0.0/16 to VPN interface (Docker conflict) - Increase VPN stabilization wait to 20s - Verify vCluster Platform connectivity before login - Match smoke test VPN configuration from maven.yml Co-authored-by: Cursor <cursoragent@cursor.com>
fix: Add routing fix for vCluster Platform connectivity
Adds publishAsyncIngestionEvent to classification, relationship, and business metadata endpoints in EntityMutationService, EntityREST, and RelationshipREST. Includes unit tests for all new event types. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…tants Centralises all hardcoded event type strings into a single constants class so consumers can reference them without string duplication. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Wait up to 5 minutes for pods to have correct image after ArgoCD sync - Poll every 30 seconds for pod readiness and image tag - Accounts for StatefulSet rolling update time after ArgoCD applies manifest - Prevents false failures when checking immediately after Temporal completes Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Add filter-level gate in ActiveServerFilter that returns 503 for admin/repair write endpoints while ENABLE_ASYNC_INGESTION flag is active. This prevents graph state mutations that bypass the async Kafka producer pipeline. Blocked: 24 endpoints across admin, repair, entity repair, and migration repair. All GET requests, config APIs, and normal CRUD operations remain unaffected. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Utility test that generates sample payloads to a file — should only be run manually when needed, not as part of regular test suites. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
All mutations documented
fix integration tests
- Retry entire rollout check up to 2 times if timeout (total max: 30 min)
- Each attempt has 15 minute timeout with 30s poll interval
- Rename variables with _SECS suffix for clarity
- Handles cases where ArgoCD sync takes longer than expected

Co-authored-by: Cursor <cursoragent@cursor.com>
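The retry-and-poll scheme described in this commit can be sketched as below. This is a minimal Java illustration under stated assumptions, not the workflow's actual logic: `waitForRollout`, the `isReady` probe, and the injected `sleepSecs` callback are hypothetical names introduced here. With 2 attempts, a 900 s timeout per attempt, and a 30 s poll interval, the check polls at most 60 times, which matches the 30 minute ceiling in the commit message.

```java
import java.util.function.BooleanSupplier;
import java.util.function.IntConsumer;

class RolloutWait {
    /**
     * Polls isReady every pollSecs seconds, giving up on an attempt after
     * timeoutSecs, and retrying the whole check up to maxAttempts times.
     * sleepSecs is injected so callers (and tests) control how waiting happens.
     * All names here are hypothetical, chosen for illustration only.
     */
    static boolean waitForRollout(BooleanSupplier isReady,
                                  int maxAttempts,
                                  int timeoutSecs,
                                  int pollSecs,
                                  IntConsumer sleepSecs) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            for (int elapsed = 0; elapsed < timeoutSecs; elapsed += pollSecs) {
                if (isReady.getAsBoolean()) {
                    return true; // rollout finished, stop polling
                }
                sleepSecs.accept(pollSecs);
            }
            // Attempt timed out; the outer loop retries the entire check.
        }
        return false;
    }

    public static void main(String[] args) {
        int[] polls = {0};
        // Fake probe that becomes ready on the 31st poll, i.e. during the second attempt.
        boolean ok = waitForRollout(() -> ++polls[0] >= 31, 2, 900, 30, secs -> { });
        System.out.println("ready=" + ok + " after " + polls[0] + " polls");
    }
}
```

Injecting the sleep keeps the sketch testable without real waiting; a production caller would pass something that actually sleeps for the given number of seconds.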
Remove VPN, vCluster, and kubectl setup from PR label release workflow. Health verification (ArgoCD sync + pod rollout) will be handled by Temporal workflows, eliminating idle billing during long waits. GitHub Actions now only triggers Temporal and polls for completion. Co-authored-by: Cursor <cursoragent@cursor.com>
Replace paths-ignore with dorny/paths-filter so the job always runs and reports a status. This fixes the issue where required checks remain in "Waiting" state when the workflow is skipped due to paths-ignore. When only docs/config files change, the job runs quickly and reports success without running the actual tests. Co-authored-by: Cursor <cursoragent@cursor.com>
- Update manual-cohort-cleanup to support multi-label detection, allow_open_pr for selective rollback, and auto-default source to github when path is provided - Update DEVELOPER-RUNBOOK with accurate tenant counts, release gates (SHA+branch validation), new gotchas, and GA/rollback flows - Update implementation doc with correct atlas ring cohort names, bugs 4-6, image override validation, parallelism details, and expanded developer guide with selective rollback instructions Made-with: Cursor
…6497) * ms-802: Trim whitespaces in name attribute * ms-802: Refactor code * ms-802: Remove custom image
… soft-deleted (#6510) When an entity is soft-deleted, tagDAO.deleteTags() was only called for HARD/PURGE deletes, leaving tag rows in tags_by_id with is_deleted=false. This caused orphaned propagated tag rows with: - is_deleted=false (should be true) - asset_metadata=null (entity no longer resolvable) - tag_meta_json.entityStatus=DELETED The fix removes the delete-type guard so tags are always soft-deleted in Cassandra regardless of whether the entity delete is soft, hard, or purge. Downstream propagation cleanup via async tasks is unaffected. Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… scaling (#6517)

ThreadPoolExecutor only creates new threads beyond corePoolSize when the queue is FULL. With queue=200, the pool stays at ~40 threads while 200 requests queue up before any expansion toward maxThreads=400 happens.

This caused thread pool exhaustion on nasdaq-vc: all 3 pods hit 40 threads with 60-109 requests queued, but the pool refused to grow because the queue (capacity 200) wasn't full yet.

Queue=10 means the pool starts creating new threads after just 10 requests are queued, matching the fast-scaling behavior of Jetty's native QueuedThreadPool.

Changed in:
- AtlasConfiguration.java: default 100 -> 10
- helm/atlas configmap: 200 -> 10
- helm/atlas-read configmap: 200 -> 10
- helm/atlas configmap-leangraph: 200 -> 10

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
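The queue-before-growth behaviour described in this commit can be reproduced with a small standalone program. The sketch below uses deliberately small, made-up numbers (core 4, max 40, 50 blocking tasks) rather than the production values, and the class and method names are hypothetical. It submits tasks that all block, then reads the pool size: with a queue of 200 the pool stays at the core size, while with a queue of 10 it grows to the maximum.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class PoolGrowthDemo {
    /** Submits `tasks` blocking tasks and returns the pool size while all of them are pending. */
    static int poolSizeAfterSubmitting(int core, int max, int queueCapacity, int tasks)
            throws InterruptedException {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                core, max, 60, TimeUnit.SECONDS, new ArrayBlockingQueue<>(queueCapacity));
        CountDownLatch release = new CountDownLatch(1);
        try {
            for (int i = 0; i < tasks; i++) {
                pool.execute(() -> {
                    try {
                        release.await(); // hold the worker so nothing drains the queue
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            return pool.getPoolSize();
        } finally {
            release.countDown(); // let every task finish
            pool.shutdown();
            pool.awaitTermination(5, TimeUnit.SECONDS);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // Large queue: 4 core threads start, the other 46 tasks just sit in the queue.
        System.out.println("queue=200 -> pool size " + poolSizeAfterSubmitting(4, 40, 200, 50)); // 4
        // Small queue: 4 core threads, 10 queued, the remaining 36 tasks each force a new thread.
        System.out.println("queue=10  -> pool size " + poolSizeAfterSubmitting(4, 40, 10, 50));  // 40
    }
}
```

The task count is chosen so that core threads plus queue capacity plus extra threads exactly absorb all 50 tasks; submitting more than that with the default rejection policy would throw RejectedExecutionException.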
* fix: (ms-903) Emit parent ENTITY_UPDATE on sub-asset soft delete * fix: (ms-903) Adding integration test cases
…or mixed index registration failures (#6436) * feat: Add index health Prometheus metrics for mixed index audit * feat: Adding tenant and pod filter * fix: Micrometer gauges bind to object references at registration time. If the object is garbage collected, the gauge returns NaN. The gauge must be registered once, using the exact same objects that get updated later. * fix: Removing custom branch * fix: Adding java doc * fix: Addressing the cursor bot reviews * feat: Removing feature branch * feat: Add self-healing for missing mixed index property keys (Phase 2) * fix: Retry RepairIndex bean lookup for async reindex during startup * fix: Retry RepairIndex bean lookup for async reindex during startup * fix: Retry RepairIndex bean lookup for async reindex during startup * chore: Remove non-Phase-2 files from PR Revert maven.yml to master and remove local dev files (grafana dashboard, docker-compose, synonym.txt, test file) that are not part of the Phase 2 self-healing changes. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: adding feature branch * fix: adding feature branch * fix: adding feature branch * trigger build * chore: trigger CI build * fix: Fixing review comments * fix: Fixing review comments * fix: Fixing review comments * fix: removing feature branch * feat: Self-healing for missing mixed index property keys (Phase 2a) * fix: Now isStringField for the repair matches the original createIndexForAttribute logic exactly: - Original: isStringField = true only when primitiveClassType == String.class && IndexType.STRING.equals(indexType) - Repair: repairIsStringField = (primitiveClass == String.class) && isStringField * fix: Increased test timeout from 60s to 120s — accommodates ES stabilization after schema repairs on fresh environments * fix: don't self-heal if the property key was never created in the first place (fresh environment). Only self-heal if the property key EXISTS in the schema but is NOT in the mixed index. 
That's the actual production failure mode --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…6485) * perf(MS-752): optimize getAccessControlEntity to avoid O(N×K) graph reads For a Persona with K existing policies, the old code called toAtlasEntityWithExtInfo() once per new policy, triggering mapRelationshipAttributes (O(K × attrs) JanusGraph reads) and then looping K existing policies calling toAtlasEntity() each time. For N=20 new policies and K=500 existing ones this caused ~10,000 redundant graph reads, producing >30s latency for CME Group. Fix: - AuthPolicyPreProcessor: promote noRelAttrRetriever to a constructor field (EntityGraphRetriever with ignoreRelationshipAttr=true) and rewrite getAccessControlEntity() to load the Persona vertex once via entityRetriever.getEntityVertex(), load scalar attrs only via noRelAttrRetriever.toAtlasEntity(), then traverse policy edges directly via GraphHelper.getActiveCollectionElementsUsingRelationship() — skipping mapRelationshipAttributes entirely. - AccessControlUtils.objectToEntityList(): add null-guard around getReferredEntities().keySet() to prevent NPE when the optimised path builds AtlasEntityWithExtInfo manually with zero existing policies (referredEntities left null by the AtlasEntityWithExtInfo constructor). Also adds AuthPolicyPreProcessorLatencyTest (10 unit tests) documenting the root-cause call-count behaviour and guarding against NPE regression. 
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * ci: trigger Java CI build for hitesh/ms-752-fix branch Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * ci: retrigger Java CI build for hitesh/ms-752-fix Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * ci: touch test file to trigger Java CI image build Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * ci: add hitesh-ms-752-fix branch to Java CI image build Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * ci: retrigger build on hitesh-ms-752-fix Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * ci: touch source to trigger image build on hitesh-ms-752-fix Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * perf(MS-752): cache ES index name, connection entity, and reuse preprocessor per batch Fix A — ESAliasStore: cache the resolved physical index name for VERTEX_INDEX_NAME. The alias→index mapping is static at runtime; one HTTP GET per ESAliasStore instance is sufficient. Previously this GET fired once per new policy in a bulk request. Fix C — AuthPolicyPreProcessor: add a per-instance Map<String,AtlasEntity> connection cache in validateConnectionAdmin(). All N policies in a bulk create typically target the same Connection, so the graph read now happens once instead of N times. Fix D — AtlasEntityStoreV2.executePreProcessor(): build a type→preprocessors map once per request (computeIfAbsent) instead of calling getPreProcessor() (which creates a fresh instance) for every entity. This is the prerequisite for Fix A and Fix C to be effective across entities in the same bulk request. 
Together with the Fix B already on this branch, the per-request I/O for a bulk create of N=20 policies against a Persona with K=500 existing policies drops to: Persona graph reads : N → 1 Policy graph reads : N×K → K (Fix B) Connection graph reads: N → 1 (Fix C) ES index-name GETs : N → 1 (Fix A + Fix D) ES alias PUTs : N (unchanged; Fix E deferred) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * Revert "perf(MS-752): cache ES index name, connection entity, and reuse preprocessor per batch" This reverts commit 023c447. * fix: address review issues in getAccessControlEntity edge traversal - Use AtlasRelationshipEdgeDirection (IN/OUT/BOTH) to pick the correct policy vertex from each edge, matching the EntityGraphRetriever pattern and avoiding the fragile getIdForDisplay() string comparison - Use specific relationship type name key ("access_control_policies") when looking up the policiesAttr from the relationshipAttributes map, instead of blindly taking iterator().next() which may pick the wrong entry if multiple relationship types define the same attribute name Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * docs: add /atlas-context skill — codebase awareness and learned constraints Living document encoding architecture, I/O cost model, constraints, review checklist, and accumulated lessons from MS-752 (Fix B, edge direction bug, relationship map key, bulk policy perf). Includes self-update protocol so Claude appends new lessons after each review/incident. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * test: exercise Fix B edge-traversal path in AuthPolicyPreProcessorLatencyTest Adds getAccessControlEntity_traversesEdgesForExistingPolicies() to cover Steps 2-3 of the Fix B optimisation — the GraphHelper edge loop that was previously unreachable because mockPersonaFetch() stubbed typeRegistry to return null. 
The new test: - Provides a real AtlasEntityType mock with a valid policiesAttrMap keyed by the specific relationship type "access_control_policies" - Mocks policiesAttr.getRelationshipEdgeDirection() = IN so the correct vertex (edge.getOutVertex()) is selected as the policy vertex - Uses mockStatic(GraphHelper.class) to return K mock edges from getActiveCollectionElementsUsingRelationship() - Asserts noRelAttrRetriever.toAtlasEntity(vertex) is called K times and that all K policy GUIDs are registered in ret.getReferredEntities() Closes the test gap flagged in the PR review (issue-3). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix: guard setRelationshipAttribute(REL_ATTR_POLICIES) inside policiesAttr != null block When typeRegistry.getEntityTypeByName() returns null, policiesAttr is null and the edge traversal is skipped. Previously setRelationshipAttribute was called unconditionally with an empty policyObjectIds, causing ESAliasStore to rebuild the Persona's ES alias with zero existing policies — silently wiping all K existing filter clauses from janusgraph_vertex_index. Move setRelationshipAttribute inside the if-block and add a warn log for the fallback case so the failure is observable in logs. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * undo * docs: add /zerograph-deployer Claude skill Step-by-step skill covering health check, connectivity test, trigger (async/sync/dry-run), status polling, input params, CI pipeline, and common ops for the mothership-zerograph deployer agent. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * test: clarify masterBehavior_* tests as post-fix regression guards Remove stale "switch assertion below" instructions and "BUG DOCUMENTATION" framing. The assertions are already set to the post-Fix-B values (times(0)) — update Javadoc to reflect that these are regression guards, not master baseline documentation. 
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * remove the skill zg-deployer, not part of this PR --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
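The edge-direction fix in this commit series (use the relationship's IN/OUT/BOTH direction to pick the policy vertex instead of comparing display IDs) can be illustrated with a small sketch. Only the IN case, where the policy is `edge.getOutVertex()`, is stated in the commit messages; the OUT and BOTH branches below are assumptions made by symmetry, and `Direction`, `Edge`, and `otherVertex` are hypothetical stand-ins, not Atlas classes.

```java
class EdgeDirectionSketch {
    /** Hypothetical stand-in for the relationship edge direction (IN/OUT/BOTH). */
    enum Direction { IN, OUT, BOTH }

    /** Hypothetical stand-in for a graph edge; vertices are plain string IDs here. */
    static final class Edge {
        final String outVertex;
        final String inVertex;

        Edge(String outVertex, String inVertex) {
            this.outVertex = outVertex;
            this.inVertex = inVertex;
        }
    }

    /** Returns the vertex at the far end of the edge, as seen from personaVertex. */
    static String otherVertex(Edge edge, Direction direction, String personaVertex) {
        switch (direction) {
            case IN:
                // Edge points into the persona, so the policy is the out-vertex
                // (this is the case described in the commit message).
                return edge.outVertex;
            case OUT:
                // Assumed by symmetry: edge leaves the persona, policy is the in-vertex.
                return edge.inVertex;
            default:
                // BOTH (assumption): pick whichever end is not the persona itself.
                return personaVertex.equals(edge.outVertex) ? edge.inVertex : edge.outVertex;
        }
    }

    public static void main(String[] args) {
        Edge edge = new Edge("policy-1", "persona-1");
        System.out.println(otherVertex(edge, Direction.IN, "persona-1")); // policy-1
    }
}
```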
The default per_page for listReviews is 30. PRs with many review comments/iterations can exceed this, causing recent approvals on the current HEAD to be missed. This caused ring releases to fail with "No approvals found" even when approvals existed on the current HEAD SHA. Made-with: Cursor
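The failure mode above (only the first 30 reviews are seen) is the usual symptom of reading a single page from a paginated API. The commit message does not say how it was fixed; one common remedy is to keep requesting pages until a short page comes back. The sketch below shows that loop against an in-memory fake; `PageFetcher`, `fetchAll`, and `fakePage` are hypothetical names, and no real GitHub client is used.

```java
import java.util.ArrayList;
import java.util.List;

class ReviewPagination {
    /** Hypothetical page source: returns the review IDs on a 1-based page. */
    interface PageFetcher {
        List<Integer> fetch(int page, int perPage);
    }

    /** Requests pages until one comes back with fewer than perPage items. */
    static List<Integer> fetchAll(PageFetcher fetcher, int perPage) {
        List<Integer> all = new ArrayList<>();
        for (int page = 1; ; page++) {
            List<Integer> batch = fetcher.fetch(page, perPage);
            all.addAll(batch);
            if (batch.size() < perPage) {
                return all; // short (or empty) page means there is nothing left
            }
        }
    }

    /** In-memory fake backend holding review IDs 0..total-1. */
    static List<Integer> fakePage(int total, int page, int perPage) {
        List<Integer> out = new ArrayList<>();
        int start = (page - 1) * perPage;
        for (int i = start; i < Math.min(total, start + perPage); i++) {
            out.add(i);
        }
        return out;
    }

    public static void main(String[] args) {
        // With 75 reviews and 30 per page, a single request would miss the last 45.
        System.out.println(fakePage(75, 1, 30).size());                                // 30
        System.out.println(fetchAll((page, n) -> fakePage(75, page, n), 30).size());   // 75
    }
}
```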
…butes (#6531) * feat: Phase 2b — controlled reindex for self-healed mixed index attributes * fix: adding the dedup on the producer side and removing the redis dependency * fix: addressing review comments
* Atlas testing harness (#6471) * added fix for deletion for <10K assets ES sync * BulkPurgeService with crash recovery, concurrency safety, and ES cleanup * removed unnecessary configs * hardcoded the self graph loading to make new object * made config fetching synchronized across threads * fixed cancel API in case of cancel+ cancel or cancel+purgeTrigger * forced cancel should be cleaned post cancellation * added the case fix for cancelPurge-> retrigger-> cancel * enforced worker queue to be called only on force cancel * addressed comments, moved cleanup to finally * added test harness suite for all endpoints first commit * added tests for harness * fixed tests and data for harness * fixed tests, added more tests related to deletes * fixed time for es sync along with flagging * added kafka helpers and more test on entity behaviour * added lineage and propagation tests * added tests for lineages and all * added tests for lineages and all * added the pending tests for no-op * fixed audit search index time by adding correct filters * fixed typedef, lineage, classification and business metadata tests * fixed glossary tests as per the UI * checkpoint benchmark1 * increased timeouts for searches * fixed lineage specific tests * updated suite * added tests for evaluator, accessor, bulk unique attr (#6345) * Testing harness extended (#6370) * added tests for evaluator, accessor, bulk unique attr * added glossary and attribute test * Testing harness extended (#6401) * chore: remove Tags V1 dead code from propagation tasks and ClassificationAssociator (#6305) - ClassificationPropagationTasks: remove isTagV2Enabled() branches in Add, UpdateText, Delete, and RefreshPropagation tasks. V2 path (Cassandra) is now always taken. Also removed unused previousRestrictPropagation* local vars from Add.run(). - ClassificationAssociator: remove V1-only updateClassificationText(null, allVertices) guarded by !isTagV2Enabled(). Remove now-unused DynamicConfigStore import. 
Part 1 of Tags V1 cleanup. Refs: MS-751 Co-authored-by: MetaClaw <metaclaw@atlan.com> Co-authored-by: sriram-atlan <sriram.aravamuthan@atlan.com> * Fix AtlasEntityHeader constructors to preserve docId, vertexId, and superTypeNames (#6304) The copy constructor AtlasEntityHeader(AtlasEntityHeader) and the entity-based constructor AtlasEntityHeader(AtlasEntity) were not copying docId, vertexId, or superTypeNames fields. When these constructors are used in the notification pipeline (e.g., convertDiffEntityToHeader), headers with null docId propagate to Elasticsearch, causing ES documents to lose their document sync references. This results in assets appearing as "not found" in the UI. Co-authored-by: Mothership Agent <mothership@atlan.com> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> Co-authored-by: sriram-atlan <sriram.aravamuthan@atlan.com> * bulk purge released forced refresh (#6323) * fix(icarus): dynamic JVM options for memory management and adjust CPU limits (#6330) * Add atlas_vertex_index ES alias for janusgraph_vertex_index on startup (#6336) Create a stable ES alias "atlas_vertex_index" pointing to the actual vertex index (e.g. janusgraph_vertex_index) during startup. This allows consumers to use a backend-agnostic index name. The alias is created once (idempotent check on every startup) and is best-effort — failures do not block Atlas startup. Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: allow apostrophe in link URL validation regex (#6342) * fix: allow apostrophe in link URL validation regex Add unit tests for LinkPreProcessor URL validation. Add branch to CI for testing. 
* remove custom branch * added tests for evaluator, accessor, bulk unique attr * fix(helm): disable soft affinity for atlas-read cassandra-online-dc STS (#6347) * fix(helm): disable soft affinity for atlas-read cassandra-online-dc STS Remove multiarch preferredDuringSchedulingIgnoredDuringExecution blocks from the nodeAffinity section of the cassandra-online-dc StatefulSet in atlas-read. These blocks caused both soft (preferred) and hard (required) affinity rules to coexist when multiarch was enabled, leading to mixed affinity behavior. The STS now matches the normal atlas cassandra STS which only uses requiredDuringSchedulingIgnoredDuringExecution in non-Development/Enterprise deployments. Fixes: MS-803 Co-Authored-By: Claude Code <noreply@anthropic.com> * commit --------- Co-authored-by: Claude Code <noreply@anthropic.com> * added atlas mcp observability skills (#6315) * added atlas mcp skills * Removed hard paths in mcp.json * docs(cohort-release): Add auto-sync check, dynamic redistribution, and release channel filtering (#6366) - Document auto-sync safety check that skips tenants without ArgoCD auto-sync - Add dynamic ring redistribution section (quarterly automation, data sources) - Document release channel filtering (MAIN-BASE, GOLDEN-MAIN-BASE only) - Add release result states explanation (success, partial_success, failed, skipped) - Update tenant counts and asset ranges in runbook - Add gotchas for skipped tenants and release channel exclusions Made-with: Cursor * added glossary and attribute test * feat: reduce icarus memory from 4Gi to 2Gi (#6380) * Switch entity_audits to niofs store type to free page cache for vertex index (#6324) * Switch entity_audits ES index to niofs store type to eliminate page cache contention entity_audits uses the default hybridfs store type, which memory-maps all segment files at index open time. 
On production clusters, this consumes 19-400GB of virtual address space per node, competing with janusgraph_vertex_index for the OS page cache and degrading search performance. niofs uses Java NIO FileChannel.read() instead of mmap — audit pages only enter the page cache during active queries and are easily evictable, freeing page cache for the vertex index that actually needs it. Changes: - ESBasedAuditRepository: add ensureStoreTypeNiofs() to createSession() startup flow. Uses a marker document (HEAD check) so the close/open migration runs exactly once across all pods and all future deployments (~1ms no-op on subsequent startups). - es-audit-mappings.json: add settings block with store.type=niofs and refresh_interval=60s so new indices are created with niofs from the start. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * Move writeNiofsMigrationMarker() into finally block per review feedback Only write the marker when both the settings update and index reopen succeed, preventing partial migration state. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(MS-701): emit parent entity UPDATE event on sub-asset relationship change (#6381) * test(MS-701): add failing integration test for missing parent UPDATE on sub-asset add When a sub-asset (Process) is added with a relationship to a parent (Table) via bulk createOrUpdate, and the parent's own attributes haven't changed, the parent entity is incorrectly excluded from the UPDATE response and Kafka notifications. This test demonstrates the bug: it creates a Table, then sends a bulk request with the same unchanged Table + a new Process referencing it. The assertion that Table appears as UPDATED will fail until the fix is applied. 
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(MS-701): emit parent entity UPDATE event on sub-asset relationship change When a sub-asset is added/deleted/restored via bulk createOrUpdate and the parent entity's own attributes are unchanged, the parent was incorrectly excluded from UPDATE notifications. This happened because RequestContext.recordEntityUpdate() checks entitiesToSkipUpdate, which blocks entities whose attributes didn't change — even when their relationships did change. Fix: Add recordEntityUpdateForRelationshipChange() to RequestContext that bypasses the entitiesToSkipUpdate check (following the existing pattern of recordEntityUpdateForNonRelationshipAttributes). Update all call sites in EntityGraphMapper and DeleteHandlerV1 that record parent entity updates due to relationship edge creation/deletion to use this new method. Affected call sites: - EntityGraphMapper.recordEntityUpdate(vertex) — simple relationship update - EntityGraphMapper.recordEntityUpdate(vertex, ctx, isAdd) — sub-asset add/remove - EntityGraphMapper inverse reference update (line ~1563) - DeleteHandlerV1.deleteEdgeReference() — both relationship and legacy edges Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * ci(MS-701): add SubAssetAddParentUpdateNotificationTest to CI integration test list The integration-tests.yml workflow uses an explicit -Dtest= list. Without this change, the new test would never run in CI. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(MS-701): rewrite test to use AtlasInProcessBaseIT instead of Docker The CI integration tests use AtlasInProcessBaseIT (starts Atlas in-process via Jetty with testcontainers for infra). The previous test extended AtlasDockerIntegrationTest which requires a private atlanhq/atlas:test Docker image not available in CI. 
Rewritten to use AtlasClientV2 API with the same test scenario: create Table, then bulk createOrUpdate with unchanged Table + new Process referencing it, and assert the Table appears as UPDATED in the response. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * test(MS-701): add Kafka ENTITY_UPDATE notification assertion for parent Table The test now verifies both: 1. REST response: Table appears in updatedEntities (existing) 2. Kafka: ENTITY_UPDATE notification emitted for Table on ATLAS_ENTITIES topic Uses ApplicationProperties to get kafka bootstrap servers (same pattern as AsyncIngestionIntegrationTest). Polls ATLAS_ENTITIES topic filtering by GUID + operationType + eventTime. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(ci): Validate build ran on ring branch, not just matching SHA (#6389) The pr-label-release workflow was checking only head_sha when validating builds, allowing releases to proceed using builds from non-ring branches that happened to share the same commit SHA. This caused an incident where ring-ms-864-keycloak-jwks-fix used a build from ms-864-keycloak-jwks-internal-url (both pointing to the same SHA) without any actual build running on the ring branch. Add branch validation for both maven build and integration tests to ensure the workflows actually ran on the expected ring branch. 
Made-with: Cursor * GOV-667 | Add duplicate policy name validation for Persona entities (#6375) * GOV-667: Validate if policy name exists or not * GOV-667: Removed comments * GOV-667: Added unit tests * GOV-667: allow purposes to have same names * GOV-667: Fix minor issues * GOV-667: only check for persona * GOV-667: Changes to correctly perform unit tests * GOV-667: Resolved review comments * GOVFOUN-235: v1 implementation for Datasets (#6172) * GOVFOUN-235: v1 implementation for Datasets * GOVFOUN-235: normalize datasetType * GOVFOUN-235: implement delete and make Qn immutable * GOVFOUN-235: block updates to element count attr * GOVFOUN-235: allow dataset to be linked to domain * GOVFOUN-235: fix delete type * GOVFOUN-235: Added tests * GOVFOUN-235: Fixed typeDefs * GOVFOUN-235: Fix tests * GOVFOUN-235: fix failing test * GOVFOUN-235: Fix minor bug * GOVFOUN-235: allow admins to edit resources * GOVFOUN-235: Enrich dataset info for audit * GOVFOUN-235: Changes after typeDef review * GOVFOUN-235: fix tests * GOVFOUN-235: Resolve reviews * GOVFOUN-235: Reverting previous commit * fix: (MS-609) Improving Task Lifecycle Management in Apps Team Workflows (#6395) * updated tests --------- Co-authored-by: Arnab Saha <arniesaha@gmail.com> Co-authored-by: MetaClaw <metaclaw@atlan.com> Co-authored-by: sriram-atlan <sriram.aravamuthan@atlan.com> Co-authored-by: mothership-ai[bot] <246624273+mothership-ai[bot]@users.noreply.github.com> Co-authored-by: Mothership Agent <mothership@atlan.com> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> Co-authored-by: Syed <150783904+syed-atlan@users.noreply.github.com> Co-authored-by: LijiAlex <liji.a@atlan.com> Co-authored-by: Hitesh Khandelwal <60309732+hitk6@users.noreply.github.com> Co-authored-by: krishnanunni-atlan <krishnanunni.m@atlan.com> Co-authored-by: ankitpatnaik-atlan <ankit.patnaik@atlan.com> Co-authored-by: salman-atlan <salman.khurshid@atlan.com> * remove unwanted files * removed checks ignore * Testing harness 
extended (#6470) * chore: remove Tags V1 dead code from propagation tasks and ClassificationAssociator (#6305) - ClassificationPropagationTasks: remove isTagV2Enabled() branches in Add, UpdateText, Delete, and RefreshPropagation tasks. V2 path (Cassandra) is now always taken. Also removed unused previousRestrictPropagation* local vars from Add.run(). - ClassificationAssociator: remove V1-only updateClassificationText(null, allVertices) guarded by !isTagV2Enabled(). Remove now-unused DynamicConfigStore import. Part 1 of Tags V1 cleanup. Refs: MS-751 Co-authored-by: MetaClaw <metaclaw@atlan.com> Co-authored-by: sriram-atlan <sriram.aravamuthan@atlan.com> * Fix AtlasEntityHeader constructors to preserve docId, vertexId, and superTypeNames (#6304) The copy constructor AtlasEntityHeader(AtlasEntityHeader) and the entity-based constructor AtlasEntityHeader(AtlasEntity) were not copying docId, vertexId, or superTypeNames fields. When these constructors are used in the notification pipeline (e.g., convertDiffEntityToHeader), headers with null docId propagate to Elasticsearch, causing ES documents to lose their document sync references. This results in assets appearing as "not found" in the UI. Co-authored-by: Mothership Agent <mothership@atlan.com> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> Co-authored-by: sriram-atlan <sriram.aravamuthan@atlan.com> * bulk purge released forced refresh (#6323) * fix(icarus): dynamic JVM options for memory management and adjust CPU limits (#6330) * Add atlas_vertex_index ES alias for janusgraph_vertex_index on startup (#6336) Create a stable ES alias "atlas_vertex_index" pointing to the actual vertex index (e.g. janusgraph_vertex_index) during startup. This allows consumers to use a backend-agnostic index name. The alias is created once (idempotent check on every startup) and is best-effort — failures do not block Atlas startup. 
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: allow apostrophe in link URL validation regex (#6342) * fix: allow apostrophe in link URL validation regex Add unit tests for LinkPreProcessor URL validation. Add branch to CI for testing. * remove custom branch * added tests for evaluator, accessor, bulk unique attr * fix(helm): disable soft affinity for atlas-read cassandra-online-dc STS (#6347) * fix(helm): disable soft affinity for atlas-read cassandra-online-dc STS Remove multiarch preferredDuringSchedulingIgnoredDuringExecution blocks from the nodeAffinity section of the cassandra-online-dc StatefulSet in atlas-read. These blocks caused both soft (preferred) and hard (required) affinity rules to coexist when multiarch was enabled, leading to mixed affinity behavior. The STS now matches the normal atlas cassandra STS which only uses requiredDuringSchedulingIgnoredDuringExecution in non-Development/Enterprise deployments. Fixes: MS-803 Co-Authored-By: Claude Code <noreply@anthropic.com> * commit --------- Co-authored-by: Claude Code <noreply@anthropic.com> * added atlas mcp observability skills (#6315) * added atlas mcp skills * Removed hard paths in mcp.json * docs(cohort-release): Add auto-sync check, dynamic redistribution, and release channel filtering (#6366) - Document auto-sync safety check that skips tenants without ArgoCD auto-sync - Add dynamic ring redistribution section (quarterly automation, data sources) - Document release channel filtering (MAIN-BASE, GOLDEN-MAIN-BASE only) - Add release result states explanation (success, partial_success, failed, skipped) - Update tenant counts and asset ranges in runbook - Add gotchas for skipped tenants and release channel exclusions Made-with: Cursor * added glossary and attribute test * feat: reduce icarus memory from 4Gi to 2Gi (#6380) * Switch entity_audits to niofs store type to free page cache for vertex index (#6324) * Switch entity_audits ES index to niofs store type to 
eliminate page cache contention entity_audits uses the default hybridfs store type, which memory-maps all segment files at index open time. On production clusters, this consumes 19-400GB of virtual address space per node, competing with janusgraph_vertex_index for the OS page cache and degrading search performance. niofs uses Java NIO FileChannel.read() instead of mmap — audit pages only enter the page cache during active queries and are easily evictable, freeing page cache for the vertex index that actually needs it. Changes: - ESBasedAuditRepository: add ensureStoreTypeNiofs() to createSession() startup flow. Uses a marker document (HEAD check) so the close/open migration runs exactly once across all pods and all future deployments (~1ms no-op on subsequent startups). - es-audit-mappings.json: add settings block with store.type=niofs and refresh_interval=60s so new indices are created with niofs from the start. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * Move writeNiofsMigrationMarker() into finally block per review feedback Only write the marker when both the settings update and index reopen succeed, preventing partial migration state. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(MS-701): emit parent entity UPDATE event on sub-asset relationship change (#6381) * test(MS-701): add failing integration test for missing parent UPDATE on sub-asset add When a sub-asset (Process) is added with a relationship to a parent (Table) via bulk createOrUpdate, and the parent's own attributes haven't changed, the parent entity is incorrectly excluded from the UPDATE response and Kafka notifications. This test demonstrates the bug: it creates a Table, then sends a bulk request with the same unchanged Table + a new Process referencing it. The assertion that Table appears as UPDATED will fail until the fix is applied. 
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(MS-701): emit parent entity UPDATE event on sub-asset relationship change When a sub-asset is added/deleted/restored via bulk createOrUpdate and the parent entity's own attributes are unchanged, the parent was incorrectly excluded from UPDATE notifications. This happened because RequestContext.recordEntityUpdate() checks entitiesToSkipUpdate, which blocks entities whose attributes didn't change — even when their relationships did change. Fix: Add recordEntityUpdateForRelationshipChange() to RequestContext that bypasses the entitiesToSkipUpdate check (following the existing pattern of recordEntityUpdateForNonRelationshipAttributes). Update all call sites in EntityGraphMapper and DeleteHandlerV1 that record parent entity updates due to relationship edge creation/deletion to use this new method. Affected call sites: - EntityGraphMapper.recordEntityUpdate(vertex) — simple relationship update - EntityGraphMapper.recordEntityUpdate(vertex, ctx, isAdd) — sub-asset add/remove - EntityGraphMapper inverse reference update (line ~1563) - DeleteHandlerV1.deleteEdgeReference() — both relationship and legacy edges Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * ci(MS-701): add SubAssetAddParentUpdateNotificationTest to CI integration test list The integration-tests.yml workflow uses an explicit -Dtest= list. Without this change, the new test would never run in CI. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(MS-701): rewrite test to use AtlasInProcessBaseIT instead of Docker The CI integration tests use AtlasInProcessBaseIT (starts Atlas in-process via Jetty with testcontainers for infra). The previous test extended AtlasDockerIntegrationTest which requires a private atlanhq/atlas:test Docker image not available in CI. 
Rewritten to use AtlasClientV2 API with the same test scenario: create Table, then bulk createOrUpdate with unchanged Table + new Process referencing it, and assert the Table appears as UPDATED in the response. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * test(MS-701): add Kafka ENTITY_UPDATE notification assertion for parent Table The test now verifies both: 1. REST response: Table appears in updatedEntities (existing) 2. Kafka: ENTITY_UPDATE notification emitted for Table on ATLAS_ENTITIES topic Uses ApplicationProperties to get kafka bootstrap servers (same pattern as AsyncIngestionIntegrationTest). Polls ATLAS_ENTITIES topic filtering by GUID + operationType + eventTime. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(ci): Validate build ran on ring branch, not just matching SHA (#6389) The pr-label-release workflow was checking only head_sha when validating builds, allowing releases to proceed using builds from non-ring branches that happened to share the same commit SHA. This caused an incident where ring-ms-864-keycloak-jwks-fix used a build from ms-864-keycloak-jwks-internal-url (both pointing to the same SHA) without any actual build running on the ring branch. Add branch validation for both maven build and integration tests to ensure the workflows actually ran on the expected ring branch. 
Made-with: Cursor * GOV-667 | Add duplicate policy name validation for Persona entities (#6375) * GOV-667: Validate if policy name exists or not * GOV-667: Removed comments * GOV-667: Added unit tests * GOV-667: allow purposes to have same names * GOV-667: Fix minor issues * GOV-667: only check for persona * GOV-667: Changes to correctly perform unit tests * GOV-667: Resolved review comments * GOVFOUN-235: v1 implementation for Datasets (#6172) * GOVFOUN-235: v1 implementation for Datasets * GOVFOUN-235: normalize datasetType * GOVFOUN-235: implement delete and make Qn immutable * GOVFOUN-235: block updates to element count attr * GOVFOUN-235: allow dataset to be linked to domain * GOVFOUN-235: fix delete type * GOVFOUN-235: Added tests * GOVFOUN-235: Fixed typeDefs * GOVFOUN-235: Fix tests * GOVFOUN-235: fix failing test * GOVFOUN-235: Fix minor bug * GOVFOUN-235: allow admins to edit resources * GOVFOUN-235: Enrich dataset info for audit * GOVFOUN-235: Changes after typeDef review * GOVFOUN-235: fix tests * GOVFOUN-235: Resolve reviews * GOVFOUN-235: Reverting previous commit * fix: (MS-609) Improving Task Lifecycle Management in Apps Team Workflows (#6395) * updated tests * fixed tests --------- Co-authored-by: Arnab Saha <arniesaha@gmail.com> Co-authored-by: MetaClaw <metaclaw@atlan.com> Co-authored-by: sriram-atlan <sriram.aravamuthan@atlan.com> Co-authored-by: mothership-ai[bot] <246624273+mothership-ai[bot]@users.noreply.github.com> Co-authored-by: Mothership Agent <mothership@atlan.com> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> Co-authored-by: Syed <150783904+syed-atlan@users.noreply.github.com> Co-authored-by: LijiAlex <liji.a@atlan.com> Co-authored-by: Hitesh Khandelwal <60309732+hitk6@users.noreply.github.com> Co-authored-by: krishnanunni-atlan <krishnanunni.m@atlan.com> Co-authored-by: ankitpatnaik-atlan <ankit.patnaik@atlan.com> Co-authored-by: salman-atlan <salman.khurshid@atlan.com> --------- Co-authored-by: Arnab Saha
<arniesaha@gmail.com> Co-authored-by: MetaClaw <metaclaw@atlan.com> Co-authored-by: sriram-atlan <sriram.aravamuthan@atlan.com> Co-authored-by: mothership-ai[bot] <246624273+mothership-ai[bot]@users.noreply.github.com> Co-authored-by: Mothership Agent <mothership@atlan.com> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> Co-authored-by: Syed <150783904+syed-atlan@users.noreply.github.com> Co-authored-by: LijiAlex <liji.a@atlan.com> Co-authored-by: Hitesh Khandelwal <60309732+hitk6@users.noreply.github.com> Co-authored-by: krishnanunni-atlan <krishnanunni.m@atlan.com> Co-authored-by: ankitpatnaik-atlan <ankit.patnaik@atlan.com> Co-authored-by: salman-atlan <salman.khurshid@atlan.com> * Atlas testing harness (#6483) * added fix for deletion for <10K assets ES sync * BulkPurgeService with crash recovery, concurrency safety, and ES cleanup * removed unnecessary configs * hardcoded the self graph loading to make new object * made config fetching synchronized across threads * fixed cancel API in case of cancel+cancel or cancel+purgeTrigger * forced cancel should be cleaned post cancellation * added the case fix for cancelPurge -> retrigger -> cancel * enforced worker queue to be called only on force cancel * addressed comments, moved cleanup to finally * added test harness suite for all endpoints first commit * added tests for harness * fixed tests and data for harness * fixed tests, added more tests related to deletes * fixed time for ES sync along with flagging * added Kafka helpers and more tests on entity behaviour * added lineage and propagation tests * added tests for lineages and all * added tests for lineages and all * added the pending tests for no-op * fixed audit search index time by adding correct filters * fixed typedef, lineage, classification and business metadata tests * fixed glossary tests as per the UI * checkpoint benchmark1 * increased timeouts for searches * fixed lineage specific tests * updated suite * added tests for evaluator, accessor, bulk
unique attr (#6345) * Testing harness extended (#6370) * added tests for evaluator, accessor, bulk unique attr * added glossary and attribute test * Testing harness extended (#6401) * updated tests * remove unwanted files * removed checks ginore * Testing harness extended (#6470) * updated tests * fixed tests * fixed qn and glossary move * removed atlas.java , checked out from master * removed atlas.java , checked out from master * added readMe.md * added claude skill to run tests and do review * added html reporter * backmerged form master * removed purge, added tenant checks * fixed produciton guardrails --------- Co-authored-by: Arnab Saha <arniesaha@gmail.com> Co-authored-by: MetaClaw <metaclaw@atlan.com> Co-authored-by: sriram-atlan <sriram.aravamuthan@atlan.com> Co-authored-by: mothership-ai[bot] <246624273+mothership-ai[bot]@users.noreply.github.com>
Co-authored-by: Mothership Agent <mothership@atlan.com> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> Co-authored-by: Syed <150783904+syed-atlan@users.noreply.github.com> Co-authored-by: LijiAlex <liji.a@atlan.com> Co-authored-by: Hitesh Khandelwal <60309732+hitk6@users.noreply.github.com> Co-authored-by: krishnanunni-atlan <krishnanunni.m@atlan.com> Co-authored-by: ankitpatnaik-atlan <ankit.patnaik@atlan.com> Co-authored-by: salman-atlan <salman.khurshid@atlan.com>
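The ring-branch validation described in the fix(ci) commit (#6389) above can be illustrated with a short sketch. It assumes workflow runs are available as dictionaries carrying `head_sha`, `head_branch`, and `conclusion` keys, as in GitHub's workflow-runs API responses. The helper name `find_valid_build` and the sample SHA are hypothetical and are not taken from the pr-label-release workflow itself.

```python
from typing import Optional


def find_valid_build(runs: list[dict], sha: str, ring_branch: str) -> Optional[dict]:
    """Return the first successful run matching BOTH the commit SHA and the ring branch."""
    for run in runs:
        if (
            run.get("head_sha") == sha
            and run.get("head_branch") == ring_branch  # the additional branch check
            and run.get("conclusion") == "success"
        ):
            return run
    return None


# Hypothetical data shaped like the incident: the only successful build for this
# SHA ran on the non-ring branch, so a SHA-only check would have accepted it.
runs = [
    {
        "head_sha": "abc123",
        "head_branch": "ms-864-keycloak-jwks-internal-url",
        "conclusion": "success",
    },
]

result = find_valid_build(runs, "abc123", "ring-ms-864-keycloak-jwks-fix")
print("valid build found" if result else "no build ran on the ring branch")
```

Matching on the branch as well as the SHA means a release is blocked until a build has actually run on the ring branch, even when another branch points at the same commit.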
* ms-696: ES sync redesign for tags propagation flow
* ms-696: Removed the DLQReplayService
* ms-696: Hardening DLQ flow
* ms-696: Resolve PR comments
* ms-696: Removed bad code that can cause OOM issue
* ms-696: Fix the source Tag issue
* ms-696: Resolved comments
* ms-696: Resolved prometheus review suggestion
* ms-696: Better diff for this file
* ms-696: Added comments back
* ms-696: Removed custom image
* ms-696: Ring Dummy commit
* ms-696: Add topic to helm config
* ms-696: Added adaptive retry
* ms-696: Reduce async read batch size to 30
* ms-696: Handle task status for ES failures
* ms-696: Updated doc
* ms-696: Resolved PR comments
* MS-696 : Tag Denorm DLQ Replay Service (#6496)
* ms-696: DLQ Replay Service
* ms-696: Added Task reference in dlq
* ms-696: Added Batching in DLQ replay service
* ms-696: Prevent memory leak
* ms-696: Handled Kafka connection error
* ms-696: Set maxPollRecords to 1 default value
* ms-696: Resolved PR comments and added metrics for consumer
* ms-696: Resolved indefinite retry error
* ms-696: Resolved metric errors
* ms-696: Index Task ES status
* dummy commit
---------
Co-authored-by: Krishnanunni M <krishnanunni.m@atlan.com>
…#6526)
* fix: (ms-928) Native ES nested type mapping support in typedef seeder
* fix: add indexTypeESMapping to equals(), hashCode(), toString() in AtlasAttributeDef
* fix: address review comments - fatal CREATE, non-fatal UPDATE, fix IndexRepairConsumer build
  - Split applyESNestedMappings into CREATE (fatal) and UPDATE (non-fatal) paths
  - Add INDEX_REPAIR_CONSUMER_ENABLED, INDEX_REPAIR_BATCH_SIZE, INDEX_REPAIR_BATCH_DELAY_MS to AtlasConfiguration
  - Add INDEX_REPAIR_CONSUMER to ActiveStateChangeHandler.HandlerOrder
  - Add REINDEX_REPAIRED_ATTRIBUTES to AtlasTaskType enum
  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: remove IndexRepairConsumer constants - not part of MS-928 scope
  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
)
* feat: add bootstrap entity policies for Context Studio types
  Add READ and CUD bootstrap policies for ContextRepository and ContextArtifact entity types. READ access for all roles, CUD restricted to admin and API token access.
* feat: add relationship policy for ContextRepository → ContextArtifact
  Allows admin and API tokens to add/update/remove the containment relationship between ContextRepository and ContextArtifact entities.
* feat: separate Skill + Context entity and relationship policies
  Split into independent policy sets:
  - READ/CUD_SKILL_ENTITIES for Skill + SkillArtifact
  - READ/CUD_CONTEXT_ENTITIES for ContextRepository + ContextArtifact
  - LINK_SKILL_TO_SKILL_ARTIFACT relationship policy (new)
  - LINK_CONTEXT_REPOSITORY_TO_ARTIFACT relationship policy (existing)
…-665)
When a new entity (e.g., DataContract) is created in a bulk request alongside a relationship attribute update on another entity (e.g., setting dataContractLatest on a Table), the relationship target has a temporary GUID that hasn't been resolved to the assigned GUID yet. This causes the vertex lookup to fail, so the relationship edge is never created.
The fix mirrors the existing pattern in mapSoftRefValue() (line 1493), which already handles this correctly by checking context.getGuidAssignments() for the temporary-to-assigned GUID mapping.
Impact: Fixes SDK-created DataContracts where the dataContractLatest relationship was not being set, causing orphaned contracts invisible from the asset's contract tab.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
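The temporary-to-assigned GUID lookup described in this commit can be sketched in a few lines. This is a language-neutral illustration in Python, not the Java code from the PR: the function names, the dictionary-based vertex store, and the sample GUID values are all hypothetical. Only the idea of consulting the GUID assignments before the vertex lookup comes from the commit message.

```python
from typing import Optional


def resolve_guid(guid: str, guid_assignments: dict[str, str]) -> str:
    """Map a temporary GUID to its assigned GUID; other GUIDs pass through unchanged."""
    return guid_assignments.get(guid, guid)


def find_vertex(
    guid: str,
    vertices_by_guid: dict[str, dict],
    guid_assignments: dict[str, str],
) -> Optional[dict]:
    """Look up a vertex only after resolving a possibly temporary GUID."""
    return vertices_by_guid.get(resolve_guid(guid, guid_assignments))


# Hypothetical bulk request: the new DataContract was submitted with the
# temporary GUID "-1" and was assigned "a1b2" during the same request.
guid_assignments = {"-1": "a1b2"}
vertices = {"a1b2": {"typeName": "DataContract"}}

# Without the mapping the lookup misses; with it the target vertex is found.
print(find_vertex("-1", vertices, {}))
print(find_vertex("-1", vertices, guid_assignments))
```

Resolving first and looking up second keeps already-assigned GUIDs working unchanged, since an unmapped GUID is returned as-is.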
…odes
Add processDelete override to ContractPreProcessor with two modes:
- Delete all versions (default): cascade-deletes all contract versions for an asset, cleans up hasContract attribute
- Delete latest only (x-atlan-contract-delete-scope: single header): deletes only the latest version, promotes previous version
Patterns follow PersonaPreProcessor cascade delete and ConnectionPreProcessor soft-delete skip.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
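The two delete modes can be summarised with a small decision sketch. It is an illustration only, assuming contract versions are given as a list ordered oldest to newest and request headers as a dictionary with lowercase keys. The function `plan_contract_delete` and its return shape are hypothetical; only the header name and its `single` value come from the commit message.

```python
SCOPE_HEADER = "x-atlan-contract-delete-scope"


def plan_contract_delete(versions: list[str], headers: dict[str, str]) -> dict:
    """Decide which versions to delete and which one (if any) becomes the latest.

    `versions` is ordered oldest to newest.
    """
    if not versions:
        return {"delete": [], "new_latest": None}

    scope = headers.get(SCOPE_HEADER, "").lower()
    if scope == "single":
        # Delete only the latest version and promote the previous one.
        remaining = versions[:-1]
        return {
            "delete": [versions[-1]],
            "new_latest": remaining[-1] if remaining else None,
        }

    # Default: cascade-delete every version of the contract.
    return {"delete": list(versions), "new_latest": None}


print(plan_contract_delete(["v1", "v2", "v3"], {}))
print(plan_contract_delete(["v1", "v2", "v3"], {SCOPE_HEADER: "single"}))
```

Treating the header as opt-in keeps the cascade delete as the default, which matches the "Delete all versions (default)" wording above.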
… (DQ-665)
When deleting only the latest contract version, the dataContractLatest relationship edge was left in DELETED state because JanusGraph doesn't restore previously-replaced edges. This caused the UI to show "something went wrong" on the asset's contract tab.
Fix: restoreAssetContractPointers() uses entityStore.createOrUpdate() to re-establish the dataContractLatest (and dataContractLatestCertified if VERIFIED) relationship edges pointing to the previous version.
Also adds __state=ACTIVE filter to getSecondLatestVersion ES query to avoid matching soft-deleted contracts.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
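The effect of the `__state=ACTIVE` filter on the second-latest lookup can be shown with an in-memory sketch. This does not reproduce the actual Elasticsearch query. The `version` field and the function `second_latest_active` are hypothetical, and only the `__state` filter itself comes from the commit message.

```python
from typing import Optional


def second_latest_active(contracts: list[dict]) -> Optional[dict]:
    """Return the second-newest contract version, ignoring soft-deleted ones."""
    active = [c for c in contracts if c.get("__state") == "ACTIVE"]
    active.sort(key=lambda c: c["version"], reverse=True)
    return active[1] if len(active) > 1 else None


# Hypothetical history: version 2 was soft-deleted earlier, version 3 is the latest.
contracts = [
    {"version": 1, "__state": "ACTIVE"},
    {"version": 2, "__state": "DELETED"},
    {"version": 3, "__state": "ACTIVE"},
]

previous = second_latest_active(contracts)
print(previous)
```

Without the state filter, version 2 would be selected as the promotion target even though it is soft-deleted. With the filter, version 1 is chosen.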
39361f0 to
a478a24
Compare
- Save/restore skipAuthorizationCheck original value instead of hardcoding false (matches StakeholderTitlePreProcessor pattern)
- Use deleteByIds batch instead of per-element deleteById loop
- Remove branch from maven.yml CI trigger
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
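The save/restore pattern from the first bullet can be sketched as below. It is a Python analogue of the idea, not the Java code from the PR: the `RequestContext` class and the `authorization_skipped` helper are hypothetical stand-ins.

```python
from contextlib import contextmanager


class RequestContext:
    """Hypothetical stand-in for a per-request context object."""

    def __init__(self) -> None:
        self.skip_authorization_check = False


@contextmanager
def authorization_skipped(ctx: RequestContext):
    """Temporarily skip authorization, then restore whatever value was set before."""
    original = ctx.skip_authorization_check  # save instead of assuming False
    ctx.skip_authorization_check = True
    try:
        yield ctx
    finally:
        ctx.skip_authorization_check = original  # restored even if the body raises


ctx = RequestContext()
ctx.skip_authorization_check = True  # an outer caller already enabled the skip
with authorization_skipped(ctx):
    pass  # privileged work would happen here
print(ctx.skip_authorization_check)
```

Restoring the saved value matters when calls nest: hardcoding `False` on exit would silently re-enable authorization checks for an outer caller that had deliberately turned them off.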
Will remove before merge to master. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
956cd7a to
30b6733
Compare
Without this filter, soft-deleted contracts could be returned as the "current version", causing processSingleVersionDelete to incorrectly reject valid delete operations. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…owed Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…DQ-665) Reverts the graphHelper.getOrCreateEdge refactor. The createOrUpdate approach for restoring asset contract pointers is an established pattern used by other preprocessors. Re-adds CI branch trigger for image build. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
e06a98e to
8003620
Compare
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
What changes were proposed in this pull request?
(Please fill in changes proposed in this fix. Create an issue in ASF JIRA before opening a pull request and
set the title of the pull request which starts with
the corresponding JIRA issue number. (e.g. ATLAS-XXXX: Fix a typo in YYY))
How was this patch tested?
(Please explain how this patch was tested. Ex: unit tests, manual tests)
(If this patch involves UI changes, please attach a screenshot; otherwise, remove this)