Post-Phase 8 analysis identifying improvement opportunities: - High priority: batch HasChildren to fix 142% regression - Medium priority: N+1 query fixes (GetIdsForKeys, GetSchedulesByContentIds) - Low priority: split ContentPublishOperationService (1758 lines) Includes implementation roadmap and success metrics. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
10 KiB
ContentService Refactoring - Further Recommendations
Generated: 2025-12-25
Branch: refactor/ContentService
Status: Post-Phase 8 analysis
Executive Summary
The ContentService refactoring successfully reduced a 3800-line monolithic class to a 923-line facade with 7 specialized services. Performance benchmarks show an overall 4.1% improvement, but identified specific areas for further optimization.
This document outlines actionable recommendations to address:
- One significant performance regression (HasChildren +142%)
- Unimplemented N+1 query optimizations from the original design
- Architectural improvements for long-term maintainability
High Priority Recommendations
1. Fix HasChildren Regression
Problem: The HasChildren_100Nodes benchmark shows a +142% regression (65ms → 157ms).
Root Cause: Each HasChildren(int id) call creates a new scope and executes a separate COUNT query:
// Current implementation in ContentCrudService.cs:380-388
public bool HasChildren(int id)
{
using (ICoreScope scope = ScopeProvider.CreateCoreScope(autoComplete: true))
{
scope.ReadLock(Constants.Locks.ContentTree);
IQuery<IContent>? query = Query<IContent>()?.Where(x => x.ParentId == id);
var count = DocumentRepository.Count(query);
return count > 0;
}
}
When called 100 times (as in the benchmark), this creates 100 scopes, acquires 100 read locks, and executes 100 database queries.
Solution: Add a batch HasChildren overload:
// Add to IContentCrudService.cs
/// <summary>
/// Checks if multiple content items have children.
/// </summary>
/// <param name="ids">The content IDs to check.</param>
/// <returns>Dictionary mapping each ID to whether it has children.</returns>
/// <remarks>
/// Performance: Single database query regardless of input size.
/// Use this instead of calling HasChildren(int) in a loop.
/// </remarks>
IReadOnlyDictionary<int, bool> HasChildren(IEnumerable<int> ids);
// Implementation in ContentCrudService.cs
public IReadOnlyDictionary<int, bool> HasChildren(IEnumerable<int> ids)
{
var idList = ids.ToList();
if (idList.Count == 0)
return new Dictionary<int, bool>();
using var scope = ScopeProvider.CreateCoreScope(autoComplete: true);
scope.ReadLock(Constants.Locks.ContentTree);
// Single query: SELECT ParentId, COUNT(*) FROM umbracoNode
// WHERE ParentId IN (...) GROUP BY ParentId
var childCounts = DocumentRepository.CountChildrenByParentIds(idList);
return idList.ToDictionary(
id => id,
id => childCounts.GetValueOrDefault(id, 0) > 0
);
}
Repository addition required:
// Add to IDocumentRepository.cs
IReadOnlyDictionary<int, int> CountChildrenByParentIds(IEnumerable<int> parentIds);
Effort: 2-4 hours Impact: Reduces 100 database round-trips to 1, fixing the 142% regression
Medium Priority Recommendations
2. Implement Planned N+1 Query Fixes
The original design document (2025-12-19-contentservice-refactor-design.md) lists batch operations that were planned but not implemented:
| Planned Method | Purpose | Location |
|---|---|---|
GetIdsForKeys(Guid[] keys) |
Batch key-to-id resolution | IIdKeyMap |
GetSchedulesByContentIds(int[] ids) |
Batch schedule lookups | ContentScheduleRepository |
ArePathsPublished(int[] contentIds) |
Batch path validation | ContentCrudService |
GetParents(int[] contentIds) |
Batch ancestor lookups | ContentCrudService |
Key hotspots identified in design doc:
GetContentSchedulesByIds(line 1025-1049): N+1 in_idKeyMap.GetIdForKeycallsIsPathPublishable(line 1070): Repeated single-item lookupsGetAncestors(line 792): Repeated single-item lookups
Recommended implementation order:
GetIdsForKeys- Used across multiple servicesGetSchedulesByContentIds- Direct performance impactArePathsPublished- Affects publish operationsGetParents- Affects tree operations
Effort: 4-8 hours per method Impact: Eliminates N+1 query patterns in identified hotspots
3. Reduce Scope Creation Overhead
Many single-item operations create a new scope per call. For repeated operations, this adds measurable overhead.
Current pattern:
public bool HasChildren(int id)
{
using (ICoreScope scope = ScopeProvider.CreateCoreScope(autoComplete: true))
{
// ... operation
}
}
Improved pattern - add internal overloads:
// Public API - creates scope
public bool HasChildren(int id)
{
using var scope = ScopeProvider.CreateCoreScope(autoComplete: true);
scope.ReadLock(Constants.Locks.ContentTree);
return HasChildrenInternal(id);
}
// Internal - reuses existing scope (for batch operations)
internal bool HasChildrenInternal(int id)
{
IQuery<IContent>? query = Query<IContent>()?.Where(x => x.ParentId == id);
return DocumentRepository.Count(query) > 0;
}
// Batch operation reuses scope for all items
public IReadOnlyDictionary<int, bool> HasChildren(IEnumerable<int> ids)
{
using var scope = ScopeProvider.CreateCoreScope(autoComplete: true);
scope.ReadLock(Constants.Locks.ContentTree);
// Prefer single query, but fallback shows the pattern
return ids.ToDictionary(id => id, HasChildrenInternal);
}
Effort: 2-4 hours Impact: Reduces overhead for internal batch operations
4. Add Missing Lock Documentation
The design document specifies that each service should document its lock contracts. This was partially implemented but could be more comprehensive.
Current state: Some methods have lock documentation in XML comments.
Recommended additions:
/// <summary>
/// Saves content items.
/// </summary>
/// <remarks>
/// <para><b>Lock contract:</b> Acquires WriteLock(ContentTree) before any modifications.</para>
/// <para><b>Cache invalidation:</b> Fires ContentCacheRefresher for all saved items.</para>
/// <para><b>Notifications:</b> ContentSavingNotification (cancellable), ContentSavedNotification.</para>
/// </remarks>
OperationResult Save(IContent content, int userId = Constants.Security.SuperUserId);
Effort: 2-4 hours Impact: Improved developer experience and debugging
Low Priority Recommendations
5. Consider Splitting ContentPublishOperationService
At 1758 lines, ContentPublishOperationService is the largest service - exceeding the original goal of keeping services under 800 lines.
Current line counts:
| Service | Lines |
|---|---|
| ContentService (facade) | 923 |
| ContentCrudService | 806 |
| ContentPublishOperationService | 1758 |
| ContentMoveOperationService | 605 |
| ContentVersionOperationService | 230 |
| ContentQueryOperationService | 169 |
Potential extractions:
-
ContentScheduleService (~200-300 lines)
PerformScheduledPublishGetContentSchedulesByIds- Schedule-related validation
-
ContentPublishBranchService (~300-400 lines)
PublishBranchPublishBranchInternal- Branch validation logic
Effort: 8-16 hours Impact: Better maintainability, clearer separation of concerns
6. Add Performance Documentation to Interfaces
Help developers choose the right methods by documenting performance characteristics:
/// <summary>
/// Gets a single content item by ID.
/// </summary>
/// <remarks>
/// <para><b>Performance:</b> O(1) database query.</para>
/// <para>For multiple items, use <see cref="GetByIds(IEnumerable{int})"/>
/// to avoid N+1 queries.</para>
/// </remarks>
IContent? GetById(int id);
/// <summary>
/// Gets multiple content items by ID.
/// </summary>
/// <remarks>
/// <para><b>Performance:</b> Single database query regardless of input size.</para>
/// <para>Preferred over calling GetById in a loop.</para>
/// </remarks>
IEnumerable<IContent> GetByIds(IEnumerable<int> ids);
Effort: 2-4 hours Impact: Helps developers avoid performance pitfalls
7. Address Remaining TODO Comments
One TODO was found in the codebase:
// src/Umbraco.Infrastructure/Services/ContentListViewServiceBase.cs:227
// TODO: Optimize the way we filter out only the nodes the user is allowed to see -
// instead of checking one by one
This suggests a similar N+1 pattern in permission checking that could benefit from batch optimization.
Effort: 4-8 hours Impact: Improved list view performance
Implementation Roadmap
Phase 1: Quick Wins (1-2 days)
- Implement batch
HasChildren(IEnumerable<int>) - Add
CountChildrenByParentIdsto repository - Update benchmark to use batch method where appropriate
Phase 2: N+1 Elimination (3-5 days)
- Implement
GetIdsForKeys(Guid[] keys)in IIdKeyMap - Implement
GetSchedulesByContentIds(int[] ids) - Add internal scope-reusing method overloads
Phase 3: Documentation & Polish (1-2 days)
- Add lock contract documentation to all public methods
- Add performance documentation to interfaces
- Update design document with lessons learned
Phase 4: Architectural (Optional, 1-2 weeks)
- Evaluate splitting ContentPublishOperationService
- Address ContentListViewServiceBase TODO
- Implement remaining batch methods (ArePathsPublished, GetParents)
Success Metrics
After implementing these recommendations, re-run benchmarks and verify:
| Metric | Current | Target |
|---|---|---|
| HasChildren_100Nodes | 157ms | <70ms (back to baseline) |
| Overall regression count | 5 | 0 (none >20%) |
| ContentPublishOperationService size | 1758 lines | <1000 lines |
| N+1 hotspots | 4 identified | 0 |
References
- Benchmark Results:
docs/plans/PerformanceBenchmarks.md - Original Design:
docs/plans/2025-12-19-contentservice-refactor-design.md - Phase Summaries:
docs/plans/2025-12-2*-contentservice-refactor-phase*-implementation-summary*.md
Conclusion
The ContentService refactoring achieved its primary goals of improved maintainability and code organization. The recommendations in this document represent "Phase 2" optimizations that would further improve performance and address technical debt identified during the refactoring process.
The highest-impact change is implementing batch HasChildren, which would resolve the only significant performance regression and establish a pattern for eliminating other N+1 queries throughout the codebase.