Building an iOS Emotion Feedback Kit to Drive Recommendation System Optimization, and Designing Its Observability

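Our team's recommendation feed ran into a thorny data problem. The business side wanted to use negative user feedback ("Dislike", "Show fewer recommendations like this") to iterate quickly on the recommendation model, but our existing event tracking could not support that. All it did was send the server a standalone, context-free HTTP request when the user tapped "Dislike". That approach caused several serious problems:

Lost context: A "dislike" event arriving at the backend could not be tied to the recommendation model version, the refresh it belonged to, the user profile it was based on, or the position and specific recommended item (Impression ID) it referred to. The data was an island.
Poor reliability: Requests were fire-and-forget. On weak networks or offline, a large share of feedback signals was simply lost.
Messy implementation: Every UI component that needed feedback copied and pasted nearly identical networking code into its own view controller. There was no shared UI/UX or logic encapsulation, and maintenance cost was very high.

To fix this at the root, we decided to build a standalone, reusable emotion feedback component from scratch: EmotionFeedbackKit. Its core mission is no longer to "send a request" but to act as a highly available "data probe" that carries complete context.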

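Initial Concept and Technology Choices

Our goal was a self-contained Kit that meets the following requirements:

Encapsulation: For callers (such as a feed cell), integration should be extremely simple: supply the necessary context data and nothing else.
State management: The Kit manages its own complete UI state internally (initial, loading, feedback succeeded, feedback failed, showing reason options, and so on).
Context injection: A clear API must let callers inject recommendation context such as impressionId, modelInfo, and feedPosition.
Data reliability: A built-in lightweight queue and batching mechanism supports request retries and local caching so that feedback signals are not lost on a poor network.

With these goals in mind, we made several key technology decisions:

Architecture pattern: Inside the Kit we use MVVM. EmotionFeedbackView (View) renders the UI, EmotionFeedbackViewModel (ViewModel) handles state and business logic, and FeedbackTelemetryService (Model/Service) stages, batches, and uploads the data.
Data contract: Instead of passing plain JSON dictionaries, we defined a strongly typed Codable struct, FeedbackEvent. This keeps the data shape stable and type-safe and makes client-server collaboration easier.
Upload strategy: We dropped immediate per-event uploads in favor of a BatchingTelemetryClient. It keeps an in-memory event queue and, when the queue reaches a size threshold or a timer fires, packs the queued events into one batch and uploads it. This noticeably reduces request frequency and server load. In a real project the queue should be persisted to disk so that events are not lost when the app exits, but to keep this article focused on the core logic we start with an in-memory queue.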

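Step-by-Step Implementation: From the Data Layer to the View Layer

1. Define the data contract and the upload service

This is the foundation of the whole Kit. A well-defined data structure makes everything that follows much easier.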

// EmotionFeedbackKit/DataModels/FeedbackContext.swift

import Foundation

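/// Recommendation context, supplied by the caller when it creates the Kit
public struct RecommendationContext: Codable, Hashable {
    /// Impression ID that uniquely identifies one display of a recommended item
    let impressionId: String
    /// Version and related information of the model that produced the recommendation
    let modelInfo: String
    /// Position of the item in the feed
    let feedPosition: Int
    /// Any other business fields that need to be tracked
    let businessPayload: [String: String]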

    public init(impressionId: String, modelInfo: String, feedPosition: Int, businessPayload: [String : String] = [:]) {
        self.impressionId = impressionId
        self.modelInfo = modelInfo
        self.feedPosition = feedPosition
        self.businessPayload = businessPayload
    }
}

/// User feedback type
public enum FeedbackType: String, Codable {
    case like
    case dislike
    case neutral // Initial state, or feedback withdrawn
}

/// The event structure that is ultimately uploaded
struct FeedbackEvent: Codable, Identifiable {
    let id: UUID
    let timestamp: Date
    let feedbackType: FeedbackType
    /// Specific reason for negative feedback, optional
    let reason: String?
    /// The associated recommendation context
    let context: RecommendationContext

    init(feedbackType: FeedbackType, reason: String? = nil, context: RecommendationContext) {
        self.id = UUID()
        self.timestamp = Date()
        self.feedbackType = feedbackType
        self.reason = reason
        self.context = context
    }
}
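
To make the contract concrete, here is a minimal sketch of what one event looks like on the wire when encoded the same way the batching client below does it (JSONEncoder with ISO 8601 dates). SampleContext, SampleEvent, encodeBatch, and decodeBatch are trimmed-down stand-ins introduced only for this illustration, and the field values are invented.

```swift
import Foundation

// Stand-ins for RecommendationContext and FeedbackEvent, repeated here
// only so the snippet compiles on its own. Not part of the Kit.
struct SampleContext: Codable, Equatable {
    let impressionId: String
    let modelInfo: String
    let feedPosition: Int
}

struct SampleEvent: Codable, Equatable {
    let id: UUID
    let timestamp: Date
    let feedbackType: String
    let reason: String?
    let context: SampleContext
}

/// Encodes a batch as a JSON array with ISO 8601 timestamps.
func encodeBatch(_ events: [SampleEvent]) throws -> Data {
    let encoder = JSONEncoder()
    encoder.dateEncodingStrategy = .iso8601
    encoder.outputFormatting = [.sortedKeys] // stable key order, easier to diff in logs
    return try encoder.encode(events)
}

/// Decodes a batch the way a Swift-based consumer or a test could.
func decodeBatch(_ data: Data) throws -> [SampleEvent] {
    let decoder = JSONDecoder()
    decoder.dateDecodingStrategy = .iso8601
    return try decoder.decode([SampleEvent].self, from: data)
}

let sample = SampleEvent(
    id: UUID(),
    timestamp: Date(timeIntervalSince1970: 1_700_000_000),
    feedbackType: "dislike",
    reason: nil, // synthesized Codable omits nil optionals from the JSON
    context: SampleContext(impressionId: "imp-123", modelInfo: "ranker-v2", feedPosition: 4)
)
```

Two details are worth agreeing on with the backend: a nil reason is left out of the JSON object rather than sent as null, and the ISO 8601 strategy drops fractional seconds.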

Next comes the upload service that handles these events. The core here is the batching logic, not the network request itself.

// EmotionFeedbackKit/Services/BatchingTelemetryClient.swift

import Foundation
import Combine
import UIKit // UIScene.willDeactivateNotification is declared in UIKit

// A protocol makes it easy to swap the implementation later or to mock it in tests
protocol FeedbackTelemetryService {
    func submit(event: FeedbackEvent)
}

final class BatchingTelemetryClient: FeedbackTelemetryService {

    // MARK: - Configuration
    private struct Config {
        static let batchSizeThreshold = 10 // Flush once the queue holds 10 events
        static let flushInterval: TimeInterval = 30.0 // Flush every 30 seconds regardless of queue size
        static let maxRetries = 3 // Maximum number of attempts per batch, including the first one
        static let retryBaseDelay: TimeInterval = 2.0 // Base retry delay (exponential backoff)
        static let endpoint = URL(string: "https://api.example.com/v1/feedback/batch")!
    }

    // MARK: - Properties
    private var eventQueue: [FeedbackEvent] = []
    // True from the moment a batch is taken off the queue until it succeeds or is discarded,
    // including the wait between retries. Prevents a second batch from replacing (and thereby
    // cancelling) the request that is still in flight.
    private var isSending = false
    private let queueLock = NSLock() // Guards eventQueue and isSending
    private var flushTimer: Timer?
    private let urlSession: URLSession

    // Keeps the current network subscription alive
    private var activeRequest: AnyCancellable?

    // Create the client on the main thread: Timer.scheduledTimer attaches to the current run loop.
    init(urlSession: URLSession = .shared) {
        self.urlSession = urlSession
        setupFlushTimer()
        
        // Observe the scene lifecycle and flush proactively when the scene is about to become inactive
        NotificationCenter.default.addObserver(self, selector: #selector(handleAppWillResignActive), name: UIScene.willDeactivateNotification, object: nil)
    }

    deinit {
        flushTimer?.invalidate()
        NotificationCenter.default.removeObserver(self)
    }

    // MARK: - Public API
    func submit(event: FeedbackEvent) {
        queueLock.lock()
        eventQueue.append(event)
        let queuedCount = eventQueue.count
        queueLock.unlock()

        // Debug logging
        print("[BatchingTelemetryClient] Event queued. Current size: \(queuedCount)")

        // NSLock is not reentrant. The lock must be released before calling flushQueue,
        // which acquires it again; holding it here would deadlock.
        if queuedCount >= Config.batchSizeThreshold {
            flushQueue(reason: "Threshold reached")
        }
    }

    // MARK: - Private Logic
    @objc private func handleAppWillResignActive() {
        // In a real project, the queue should be persisted to disk here
        flushQueue(reason: "App resigning active")
    }
    
    private func setupFlushTimer() {
        flushTimer = Timer.scheduledTimer(withTimeInterval: Config.flushInterval, repeats: true) { [weak self] _ in
            self?.flushQueue(reason: "Timer fired")
        }
    }

    private func flushQueue(reason: String) {
        queueLock.lock()
        guard !eventQueue.isEmpty, !isSending else {
            let skippedForInFlightBatch = isSending
            queueLock.unlock()
            if skippedForInFlightBatch {
                 print("[BatchingTelemetryClient] Flush skipped: another flush is in progress.")
            }
            return
        }

        isSending = true
        let batchToProcess = eventQueue
        eventQueue.removeAll()
        queueLock.unlock()

        print("[BatchingTelemetryClient] Flushing queue. Reason: \(reason). Batch size: \(batchToProcess.count)")
        sendBatch(batchToProcess, attempt: 1)
    }

    private func sendBatch(_ batch: [FeedbackEvent], attempt: Int) {
        guard let request = buildRequest(for: batch) else {
            finishBatch() // Encoding failed; allow later flushes to proceed
            return
        }

        activeRequest = urlSession.dataTaskPublisher(for: request)
            .tryMap { data, response -> Data in
                guard let httpResponse = response as? HTTPURLResponse, (200...299).contains(httpResponse.statusCode) else {
                    throw URLError(.badServerResponse)
                }
                return data
            }
            .receive(on: DispatchQueue.main) // Handle completion on one known queue
            .sink(receiveCompletion: { [weak self] completion in
                guard let self = self else { return }
                
                if case .failure(let error) = completion {
                    print("[BatchingTelemetryClient] Batch send failed (attempt \(attempt)): \(error.localizedDescription)")
                    self.handleFailedBatch(batch, attempt: attempt)
                } else {
                    print("[BatchingTelemetryClient] Batch sent successfully.")
                    self.finishBatch()
                }
            }, receiveValue: { _ in })
    }
    
    private func handleFailedBatch(_ batch: [FeedbackEvent], attempt: Int) {
        if attempt < Config.maxRetries {
            let delay = Config.retryBaseDelay * pow(2.0, Double(attempt - 1))
            
            print("[BatchingTelemetryClient] Scheduling retry \(attempt + 1) in \(delay) seconds.")
            
            DispatchQueue.main.asyncAfter(deadline: .now() + delay) { [weak self] in
                self?.sendBatch(batch, attempt: attempt + 1)
            }
        } else {
            print("[BatchingTelemetryClient] Batch failed after max retries. Discarding batch.")
            // In production, write the failed batch to a "dead letter queue" for later analysis instead of dropping it
            finishBatch()
        }
    }

    /// Releases the finished request and lets the next flush start.
    private func finishBatch() {
        activeRequest = nil
        queueLock.lock()
        isSending = false
        queueLock.unlock()
    }

    private func buildRequest(for batch: [FeedbackEvent]) -> URLRequest? {
        do {
            let encoder = JSONEncoder()
            encoder.dateEncodingStrategy = .iso8601
            let data = try encoder.encode(batch)

            var request = URLRequest(url: Config.endpoint)
            request.httpMethod = "POST"
            request.setValue("application/json", forHTTPHeaderField: "Content-Type")
            request.httpBody = data
            return request
        } catch {
            print("[BatchingTelemetryClient] Error encoding batch: \(error)")
            // Critical error path: if encoding fails, this batch is essentially unrecoverable
            return nil
        }
    }
}

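This BatchingTelemetryClient covers the key elements of production-grade code: configuration, thread safety, two batching triggers (count and time), retry logic with exponential backoff, and a response to the app lifecycle. The time trigger is a common source of mistakes: many developers implement only the count threshold and forget the time threshold, so a small number of events can sit in memory indefinitely and never be uploaded.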
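
The backoff schedule is easy to misread, so here is a small standalone sketch of the same arithmetic. The function name retryDelay(afterFailedAttempt:) is hypothetical; the defaults mirror the Config values above (a 2-second base delay and maxRetries of 3).

```swift
import Foundation

/// Delay before the retry that follows failed attempt number `attempt` (1-based).
/// Returns nil when no retry remains, mirroring the `attempt < maxRetries` check.
func retryDelay(afterFailedAttempt attempt: Int,
                baseDelay: TimeInterval = 2.0,
                maxRetries: Int = 3) -> TimeInterval? {
    guard attempt >= 1, attempt < maxRetries else { return nil }
    return baseDelay * pow(2.0, Double(attempt - 1))
}

// Attempts 1 and 2 are followed by a retry; a failed attempt 3 is final.
let schedule = (1...3).map { retryDelay(afterFailedAttempt: $0) }
// schedule == [2.0, 4.0, nil]
```

With these values, maxRetries counts total attempts: a batch is tried at most three times, with waits of 2 and 4 seconds in between, before it is discarded.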

2. Implement the core view model

The ViewModel is the bridge between the view and the data service. It holds all the state the UI needs.

// EmotionFeedbackKit/ViewModel/EmotionFeedbackViewModel.swift

import Foundation
import Combine

class EmotionFeedbackViewModel: ObservableObject {

    enum State {
        case initial // Nothing submitted yet
        case loading // Submission in progress
        case success(FeedbackType) // Submission succeeded
        case error(String) // Submission failed
    }
    
    @Published private(set) var state: State = .initial

    private let context: RecommendationContext
    private let telemetryService: FeedbackTelemetryService
    
    // Holds the pending delayed submission
    private var actionDebouncer: AnyCancellable?

    init(context: RecommendationContext, telemetryService: FeedbackTelemetryService) {
        self.context = context
        self.telemetryService = telemetryService
    }

    func registerFeedback(type: FeedbackType, reason: String? = nil) {
        // Ignore taps while a submission is already in progress
        if case .loading = state { return }

        // Start processing and enter the loading state
        state = .loading

        // Wait 0.3 seconds before submitting. Taps that arrive in this window are rejected
        // by the loading check above, so an accidental double tap produces a single event.
        actionDebouncer = Just(())
            .delay(for: .milliseconds(300), scheduler: DispatchQueue.main)
            .sink { [weak self] _ in
                self?.performFeedbackSubmission(type: type, reason: reason)
            }
    }
    
    private func performFeedbackSubmission(type: FeedbackType, reason: String?) {
        let event = FeedbackEvent(feedbackType: type, reason: reason, context: context)
        telemetryService.submit(event: event)

        // Simulate network latency and a success callback. In the real world the upload service
        // is asynchronous, and the UI can show success almost immediately because the event
        // is already in the queue. The delay here exists only for visual effect.
        // Note that nothing in this version ever sets `.error`, since submit(event:) reports no result.
        DispatchQueue.main.asyncAfter(deadline: .now() + 0.5) { [weak self] in
            self?.state = .success(type)
        }
    }
    
    // Unit testing ideas:
    // 1. After calling `registerFeedback`, verify that `state` becomes `.loading`.
    // 2. Mock `FeedbackTelemetryService` and verify that `submit` is called with a `FeedbackEvent` whose contents match expectations.
    // 3. Verify that calling `registerFeedback` again while in `.loading` is ignored.
    // 4. After the success callback, verify that `state` becomes `.success`.
}
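
The second testing idea needs a test double for the telemetry service. The following is a minimal sketch of such a spy. To keep it compilable on its own it uses simplified stand-ins (StubEvent and StubTelemetryService, both hypothetical) in place of the Kit's FeedbackEvent and FeedbackTelemetryService.

```swift
import Foundation

// Simplified stand-ins for FeedbackEvent and FeedbackTelemetryService,
// reduced to what the test double needs. Not part of the Kit.
struct StubEvent: Equatable {
    let feedbackType: String
    let reason: String?
    let impressionId: String
}

protocol StubTelemetryService {
    func submit(event: StubEvent)
}

/// Records every submitted event so a test can assert on them afterwards.
final class SpyTelemetryService: StubTelemetryService {
    private let lock = NSLock()
    private var recorded: [StubEvent] = []

    /// Snapshot of everything submitted so far, in submission order.
    var submittedEvents: [StubEvent] {
        lock.lock(); defer { lock.unlock() }
        return recorded
    }

    func submit(event: StubEvent) {
        lock.lock(); defer { lock.unlock() }
        recorded.append(event)
    }
}

let spy = SpyTelemetryService()
spy.submit(event: StubEvent(feedbackType: "dislike", reason: "repetitive", impressionId: "imp-123"))
```

In the Kit itself, the spy would conform to FeedbackTelemetryService and be passed to EmotionFeedbackViewModel(context:telemetryService:). Because the view model submits after a 300 ms delay, a real test would have to wait (for example with an XCTest expectation) before inspecting submittedEvents.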

3. Build the SwiftUI view

Finally, we build the UI with SwiftUI. The view is driven entirely by the ViewModel's state.

// EmotionFeedbackKit/Views/EmotionFeedbackView.swift

import SwiftUI

public struct EmotionFeedbackView: View {
    
    @StateObject private var viewModel: EmotionFeedbackViewModel

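    // One shared default client for all feedback views. Creating a client per cell would give
    // every cell its own queue and timer, so events would never be batched together.
    private static let defaultTelemetryService: FeedbackTelemetryService = BatchingTelemetryClient()

    /// Public entry point; uses the Kit's default batching client.
    public init(context: RecommendationContext) {
        self.init(context: context, telemetryService: Self.defaultTelemetryService)
    }

    // Dependency injection for tests and custom upload services. This initializer is internal
    // because a public signature cannot mention the internal FeedbackTelemetryService protocol;
    // exposing it would require making the protocol and FeedbackEvent public as well.
    init(context: RecommendationContext, telemetryService: FeedbackTelemetryService) {
        _viewModel = StateObject(wrappedValue: EmotionFeedbackViewModel(context: context, telemetryService: telemetryService))
    }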

    public var body: some View {
        HStack(spacing: 12) {
            switch viewModel.state {
            case .initial:
                feedbackButton(type: .like, systemImage: "hand.thumbsup")
                feedbackButton(type: .dislike, systemImage: "hand.thumbsdown")
            
            case .loading:
                ProgressView()
                    .progressViewStyle(CircularProgressViewStyle())
                    .scaleEffect(0.8)
                Text("Submitting...")
                    .font(.caption)
                    .foregroundColor(.secondary)
            
            case .success(let type):
                Image(systemName: type == .like ? "checkmark.circle.fill" : "hand.thumbsdown.fill")
                    .foregroundColor(type == .like ? .green : .orange)
                Text(type == .like ? "Thanks for your feedback" : "We'll show fewer recommendations like this")
                    .font(.caption)
                    .foregroundColor(.secondary)
            
            case .error(let message):
                Image(systemName: "exclamationmark.triangle.fill")
                    .foregroundColor(.red)
                Text(message)
                    .font(.caption)
                    .foregroundColor(.red)
            }
        }
        .animation(.spring(), value: viewModel.state) // Animate state transitions; State is Equatable, so it can be passed directly
    }
    
    private func feedbackButton(type: FeedbackType, systemImage: String) -> some View {
        Button(action: {
            viewModel.registerFeedback(type: type)
        }) {
            Image(systemName: systemImage)
                .font(.system(size: 16))
                .foregroundColor(.gray)
                .padding(8)
                .background(Color.gray.opacity(0.15))
                .clipShape(Circle())
        }
        .buttonStyle(PlainButtonStyle())
    }
}

// The `.animation(_:value:)` call above needs an Equatable value, so State gets explicit Equatable and Hashable conformances
extension EmotionFeedbackViewModel.State: Equatable, Hashable {
    public static func == (lhs: EmotionFeedbackViewModel.State, rhs: EmotionFeedbackViewModel.State) -> Bool {
        switch (lhs, rhs) {
        case (.initial, .initial): return true
        case (.loading, .loading): return true
        case (.success(let a), .success(let b)): return a == b
        case (.error(let a), .error(let b)): return a == b
        default: return false
        }
    }
    
    public func hash(into hasher: inout Hasher) {
        switch self {
        case .initial:
            hasher.combine(0)
        case .loading:
            hasher.combine(1)
        case .success(let type):
            hasher.combine(2)
            hasher.combine(type)
        case .error(let message):
            hasher.combine(3)
            hasher.combine(message)
        }
    }
}

4. Integrate into the main app

Using the Kit inside a feed cell is now simple and clear.

// In Main App's Feed View
struct FeedItemView: View {
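    let item: FeedItem // Assumed to be a data model that carries the recommendation info

    var body: some View {
        VStack(alignment: .leading) {
            // ... main body of the recommended content ...
            Text(item.title)
            Text(item.description)
            
            // ... other UI elements ...

            // Embed the feedback Kit at the bottom of the cell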
            HStack {
                Spacer()
                EmotionFeedbackView(
                    context: RecommendationContext(
                        impressionId: item.id,
                        modelInfo: item.modelInfo,
                        feedPosition: item.positionInFeed
                    )
                )
            }
            .padding(.top, 8)
        }
        .padding()
        .background(Color.white)
        .cornerRadius(12)
        .shadow(radius: 2)
    }
}

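With this design, FeedItemView needs to know nothing about network requests, state management, or data upload. Its only job is to create an EmotionFeedbackView and hand it the context (RecommendationContext). This achieves the separation of concerns we set out to reach.

The Final Architecture

The data flow and component relationships of EmotionFeedbackKit can be summarized in the following diagram: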

graph TD
    subgraph MainApp
        A[FeedItemView] -- instantiates & provides context --> B
    end
    
    subgraph EmotionFeedbackKit
        B[EmotionFeedbackView] -- user interaction --> C[EmotionFeedbackViewModel]
        C -- holds state for --> B
        C -- creates event & calls --> D[FeedbackTelemetryService]
        D -- queues & batches events --> E((BatchingQueue))
        E -- on flush --> F{NetworkClient}
    end

    F -- sends batch request --> G([API Backend])

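This architecture addresses each of the problems we started with:

Complete context: RecommendationContext ensures that every piece of feedback carries precise attribution information.
Reliable data: The queue and retry mechanism in BatchingTelemetryClient substantially raises the upload success rate compared with fire-and-forget requests.
Clear, reusable code: EmotionFeedbackKit is a standalone, pluggable unit that any screen needing this feature can integrate easily.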

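Limitations and Future Iterations

This version of EmotionFeedbackKit can already solve the core problems in production, but it is not perfect. A pragmatic engineer needs to know where the current implementation stops and where it can improve.

Queue persistence: The event queue is currently memory-only. If the user force-quits the app after submitting feedback but before the batch is uploaded, that data is lost. The next iteration must persist the queue in BatchingTelemetryClient to disk, for example with a lightweight database (SQLite or Core Data) that stores pending events. On launch, the app should check for and upload any events left over from the previous session.
Network awareness: The current implementation does not monitor network state. It still fires its scheduled flushes and retries while the network is unavailable, which wastes battery, and it discards the batch once the retries run out. NWPathMonitor could be used to observe connectivity, so that uploads are attempted only when the network is available, or a flush is triggered as soon as the network switches from unavailable to available.
Remote configuration: Many parameters hard-coded in the Kit, such as batch size, flush interval, and retry count, should be delivered from a remote configuration service. That would let us adjust the upload strategy to server load without shipping a new app version.
Richer UI interaction: Negative feedback is currently a single tap. A future version could show a panel of preset reasons (such as "Not interested in this content" or "Seen this repeatedly") after the user taps "Dislike" and upload the selected reason in the reason field of FeedbackEvent, giving model optimization a finer-grained signal.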
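
As a starting point for the queue persistence item, here is a minimal sketch of a file-backed queue. It is a deliberately simple alternative to the SQLite or Core Data options mentioned above: the whole queue is rewritten as one JSON file on every change, which is only reasonable for small queues. DiskBackedQueue and its methods are hypothetical and not part of the Kit as shown.

```swift
import Foundation

/// A minimal file-backed FIFO queue. Every mutation rewrites one JSON file.
final class DiskBackedQueue<Element: Codable> {
    private let fileURL: URL
    private let lock = NSLock()
    private var items: [Element]

    init(fileURL: URL) {
        self.fileURL = fileURL
        // Reload what the previous launch left behind; start empty if the
        // file is missing or unreadable.
        if let data = try? Data(contentsOf: fileURL),
           let saved = try? JSONDecoder().decode([Element].self, from: data) {
            items = saved
        } else {
            items = []
        }
    }

    var count: Int {
        lock.lock(); defer { lock.unlock() }
        return items.count
    }

    func append(_ element: Element) throws {
        lock.lock(); defer { lock.unlock() }
        items.append(element)
        try persist()
    }

    /// Returns up to `limit` of the oldest items without removing them.
    func peek(limit: Int) -> [Element] {
        lock.lock(); defer { lock.unlock() }
        return Array(items.prefix(max(0, limit)))
    }

    /// Removes the `n` oldest items. Call only after the server has accepted them.
    func removeFirst(_ n: Int) throws {
        lock.lock(); defer { lock.unlock() }
        items.removeFirst(max(0, min(n, items.count)))
        try persist()
    }

    // Atomic write: the file is replaced as a whole, never left half-written.
    private func persist() throws {
        let data = try JSONEncoder().encode(items)
        try data.write(to: fileURL, options: .atomic)
    }
}

let demoURL = FileManager.default.temporaryDirectory
    .appendingPathComponent("feedback-queue-\(UUID().uuidString).json")
let queue = DiskBackedQueue<String>(fileURL: demoURL)
try queue.append("event-1")
try queue.append("event-2")
// A second instance simulates the next app launch and sees the same items.
let reloaded = DiskBackedQueue<String>(fileURL: demoURL)
```

The peek-then-remove split matters: a batch stays on disk until the upload succeeds, so a crash or force-quit in the middle of a request leads to a re-send on the next launch rather than a loss. The backend would then need to deduplicate by event id.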
