docs: add further refactoring recommendations for ContentService

Post-Phase 8 analysis identifying improvement opportunities:
- High priority: batch HasChildren to fix 142% regression
- Medium priority: N+1 query fixes (GetIdsForKeys, GetSchedulesByContentIds)
- Low priority: split ContentPublishOperationService (1758 lines)

Includes implementation roadmap and success metrics.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
This commit is contained in:
2025-12-25 00:03:35 +00:00
parent 6af7726c39
commit 43031ca472

View File

@@ -0,0 +1,325 @@
# ContentService Refactoring - Further Recommendations
**Generated:** 2025-12-25
**Branch:** `refactor/ContentService`
**Status:** Post-Phase 8 analysis
---
## Executive Summary
The ContentService refactoring successfully reduced a 3800-line monolithic class to a 923-line facade with 7 specialized services. Performance benchmarks show an overall 4.1% improvement, but identified specific areas for further optimization.
This document outlines actionable recommendations to address:
- One significant performance regression (HasChildren +142%)
- Unimplemented N+1 query optimizations from the original design
- Architectural improvements for long-term maintainability
---
## High Priority Recommendations
### 1. Fix HasChildren Regression
**Problem:** The `HasChildren_100Nodes` benchmark shows a +142% regression (65ms → 157ms).
**Root Cause:** Each `HasChildren(int id)` call creates a new scope and executes a separate COUNT query:
```csharp
// Current implementation in ContentCrudService.cs:380-388
public bool HasChildren(int id)
{
using (ICoreScope scope = ScopeProvider.CreateCoreScope(autoComplete: true))
{
scope.ReadLock(Constants.Locks.ContentTree);
IQuery<IContent>? query = Query<IContent>()?.Where(x => x.ParentId == id);
var count = DocumentRepository.Count(query);
return count > 0;
}
}
```
When called 100 times (as in the benchmark), this creates 100 scopes, acquires 100 read locks, and executes 100 database queries.
**Solution:** Add a batch `HasChildren` overload:
```csharp
// Add to IContentCrudService.cs
/// <summary>
/// Checks if multiple content items have children.
/// </summary>
/// <param name="ids">The content IDs to check.</param>
/// <returns>Dictionary mapping each ID to whether it has children.</returns>
/// <remarks>
/// Performance: Single database query regardless of input size.
/// Use this instead of calling HasChildren(int) in a loop.
/// </remarks>
IReadOnlyDictionary<int, bool> HasChildren(IEnumerable<int> ids);
```
```csharp
// Implementation in ContentCrudService.cs
public IReadOnlyDictionary<int, bool> HasChildren(IEnumerable<int> ids)
{
var idList = ids.ToList();
if (idList.Count == 0)
return new Dictionary<int, bool>();
using var scope = ScopeProvider.CreateCoreScope(autoComplete: true);
scope.ReadLock(Constants.Locks.ContentTree);
// Single query: SELECT ParentId, COUNT(*) FROM umbracoNode
// WHERE ParentId IN (...) GROUP BY ParentId
var childCounts = DocumentRepository.CountChildrenByParentIds(idList);
return idList.ToDictionary(
id => id,
id => childCounts.GetValueOrDefault(id, 0) > 0
);
}
```
**Repository addition required:**
```csharp
// Add to IDocumentRepository.cs
IReadOnlyDictionary<int, int> CountChildrenByParentIds(IEnumerable<int> parentIds);
```
**Effort:** 2-4 hours
**Impact:** Reduces 100 database round-trips to 1, fixing the 142% regression
---
## Medium Priority Recommendations
### 2. Implement Planned N+1 Query Fixes
The original design document (`2025-12-19-contentservice-refactor-design.md`) lists batch operations that were planned but not implemented:
| Planned Method | Purpose | Location |
|----------------|---------|----------|
| `GetIdsForKeys(Guid[] keys)` | Batch key-to-id resolution | IIdKeyMap |
| `GetSchedulesByContentIds(int[] ids)` | Batch schedule lookups | ContentScheduleRepository |
| `ArePathsPublished(int[] contentIds)` | Batch path validation | ContentCrudService |
| `GetParents(int[] contentIds)` | Batch ancestor lookups | ContentCrudService |
**Key hotspots identified in design doc:**
- `GetContentSchedulesByIds` (line 1025-1049): N+1 in `_idKeyMap.GetIdForKey` calls
- `IsPathPublishable` (line 1070): Repeated single-item lookups
- `GetAncestors` (line 792): Repeated single-item lookups
**Recommended implementation order:**
1. `GetIdsForKeys` - Used across multiple services
2. `GetSchedulesByContentIds` - Direct performance impact
3. `ArePathsPublished` - Affects publish operations
4. `GetParents` - Affects tree operations
**Effort:** 4-8 hours per method
**Impact:** Eliminates N+1 query patterns in identified hotspots
---
### 3. Reduce Scope Creation Overhead
Many single-item operations create a new scope per call. For repeated operations, this adds measurable overhead.
**Current pattern:**
```csharp
public bool HasChildren(int id)
{
using (ICoreScope scope = ScopeProvider.CreateCoreScope(autoComplete: true))
{
// ... operation
}
}
```
**Improved pattern - add internal overloads:**
```csharp
// Public API - creates scope
public bool HasChildren(int id)
{
using var scope = ScopeProvider.CreateCoreScope(autoComplete: true);
scope.ReadLock(Constants.Locks.ContentTree);
return HasChildrenInternal(id);
}
// Internal - reuses existing scope (for batch operations)
internal bool HasChildrenInternal(int id)
{
IQuery<IContent>? query = Query<IContent>()?.Where(x => x.ParentId == id);
return DocumentRepository.Count(query) > 0;
}
// Batch operation reuses scope for all items
public IReadOnlyDictionary<int, bool> HasChildren(IEnumerable<int> ids)
{
using var scope = ScopeProvider.CreateCoreScope(autoComplete: true);
scope.ReadLock(Constants.Locks.ContentTree);
// Prefer single query, but fallback shows the pattern
return ids.ToDictionary(id => id, HasChildrenInternal);
}
```
**Effort:** 2-4 hours
**Impact:** Reduces overhead for internal batch operations
---
### 4. Add Missing Lock Documentation
The design document specifies that each service should document its lock contracts. This was partially implemented but could be more comprehensive.
**Current state:** Some methods have lock documentation in XML comments.
**Recommended additions:**
```csharp
/// <summary>
/// Saves content items.
/// </summary>
/// <remarks>
/// <para><b>Lock contract:</b> Acquires WriteLock(ContentTree) before any modifications.</para>
/// <para><b>Cache invalidation:</b> Fires ContentCacheRefresher for all saved items.</para>
/// <para><b>Notifications:</b> ContentSavingNotification (cancellable), ContentSavedNotification.</para>
/// </remarks>
OperationResult Save(IContent content, int userId = Constants.Security.SuperUserId);
```
**Effort:** 2-4 hours
**Impact:** Improved developer experience and debugging
---
## Low Priority Recommendations
### 5. Consider Splitting ContentPublishOperationService
At 1758 lines, `ContentPublishOperationService` is the largest service - exceeding the original goal of keeping services under 800 lines.
**Current line counts:**
| Service | Lines |
|---------|------:|
| ContentService (facade) | 923 |
| ContentCrudService | 806 |
| ContentPublishOperationService | **1758** |
| ContentMoveOperationService | 605 |
| ContentVersionOperationService | 230 |
| ContentQueryOperationService | 169 |
**Potential extractions:**
1. **ContentScheduleService** (~200-300 lines)
- `PerformScheduledPublish`
- `GetContentSchedulesByIds`
- Schedule-related validation
2. **ContentPublishBranchService** (~300-400 lines)
- `PublishBranch`
- `PublishBranchInternal`
- Branch validation logic
**Effort:** 8-16 hours
**Impact:** Better maintainability, clearer separation of concerns
---
### 6. Add Performance Documentation to Interfaces
Help developers choose the right methods by documenting performance characteristics:
```csharp
/// <summary>
/// Gets a single content item by ID.
/// </summary>
/// <remarks>
/// <para><b>Performance:</b> O(1) database query.</para>
/// <para>For multiple items, use <see cref="GetByIds(IEnumerable{int})"/>
/// to avoid N+1 queries.</para>
/// </remarks>
IContent? GetById(int id);
/// <summary>
/// Gets multiple content items by ID.
/// </summary>
/// <remarks>
/// <para><b>Performance:</b> Single database query regardless of input size.</para>
/// <para>Preferred over calling GetById in a loop.</para>
/// </remarks>
IEnumerable<IContent> GetByIds(IEnumerable<int> ids);
```
**Effort:** 2-4 hours
**Impact:** Helps developers avoid performance pitfalls
---
### 7. Address Remaining TODO Comments
One TODO was found in the codebase:
```csharp
// src/Umbraco.Infrastructure/Services/ContentListViewServiceBase.cs:227
// TODO: Optimize the way we filter out only the nodes the user is allowed to see -
// instead of checking one by one
```
This suggests a similar N+1 pattern in permission checking that could benefit from batch optimization.
**Effort:** 4-8 hours
**Impact:** Improved list view performance
---
## Implementation Roadmap
### Phase 1: Quick Wins (1-2 days)
- [ ] Implement batch `HasChildren(IEnumerable<int>)`
- [ ] Add `CountChildrenByParentIds` to repository
- [ ] Update benchmark to use batch method where appropriate
### Phase 2: N+1 Elimination (3-5 days)
- [ ] Implement `GetIdsForKeys(Guid[] keys)` in IIdKeyMap
- [ ] Implement `GetSchedulesByContentIds(int[] ids)`
- [ ] Add internal scope-reusing method overloads
### Phase 3: Documentation & Polish (1-2 days)
- [ ] Add lock contract documentation to all public methods
- [ ] Add performance documentation to interfaces
- [ ] Update design document with lessons learned
### Phase 4: Architectural (Optional, 1-2 weeks)
- [ ] Evaluate splitting ContentPublishOperationService
- [ ] Address ContentListViewServiceBase TODO
- [ ] Implement remaining batch methods (ArePathsPublished, GetParents)
---
## Success Metrics
After implementing these recommendations, re-run benchmarks and verify:
| Metric | Current | Target |
|--------|---------|--------|
| HasChildren_100Nodes | 157ms | <70ms (back to baseline) |
| Overall regression count | 5 | 0 (none >20%) |
| ContentPublishOperationService size | 1758 lines | <1000 lines |
| N+1 hotspots | 4 identified | 0 |
---
## References
- **Benchmark Results:** `docs/plans/PerformanceBenchmarks.md`
- **Original Design:** `docs/plans/2025-12-19-contentservice-refactor-design.md`
- **Phase Summaries:** `docs/plans/2025-12-2*-contentservice-refactor-phase*-implementation-summary*.md`
---
## Conclusion
The ContentService refactoring achieved its primary goals of improved maintainability and code organization. The recommendations in this document represent "Phase 2" optimizations that would further improve performance and address technical debt identified during the refactoring process.
The highest-impact change is implementing batch `HasChildren`, which would resolve the only significant performance regression and establish a pattern for eliminating other N+1 queries throughout the codebase.