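Transforming data per batch can be tricky. For example, suppose you want to use Z-score normalization to transform raw numerical data. Z-score normalization requires the mean and standard deviation of the feature. However, a per-batch transformation only has access to one batch of data, not the full dataset. So if the batches vary widely, a Z-score of -2.5 in one batch doesn't mean the same thing as a Z-score of -2.5 in another batch. As a workaround, your system can precompute the mean and standard deviation across the entire dataset and then use them as constants in the model.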
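The difference between per-batch and whole-dataset statistics can be illustrated with a minimal sketch. The batch values, the `z_scores` helper, and the `GLOBAL_MEAN` / `GLOBAL_STD` names are assumptions made up for illustration; they don't come from any particular library or pipeline.

```python
import statistics


def z_scores(values, mean, std):
    """Return the Z-score of each value given a mean and standard deviation."""
    return [(v - mean) / std for v in values]


# Two hypothetical batches of the same feature with very different ranges.
batch_a = [10.0, 12.0, 14.0, 16.0, 18.0]
batch_b = [100.0, 120.0, 140.0, 160.0, 180.0]
dataset = batch_a + batch_b

# Per-batch normalization: each batch only sees its own statistics.
per_batch_a = z_scores(batch_a, statistics.fmean(batch_a), statistics.pstdev(batch_a))
per_batch_b = z_scores(batch_b, statistics.fmean(batch_b), statistics.pstdev(batch_b))

# Workaround: precompute the statistics once over the entire dataset
# and reuse them as constants for every batch.
GLOBAL_MEAN = statistics.fmean(dataset)
GLOBAL_STD = statistics.pstdev(dataset)
global_a = z_scores(batch_a, GLOBAL_MEAN, GLOBAL_STD)
global_b = z_scores(batch_b, GLOBAL_MEAN, GLOBAL_STD)

# Per batch, the raw values 10.0 and 100.0 receive (nearly) the same Z-score,
# even though they are very different. With global constants they do not.
print("per-batch:", per_batch_a[0], per_batch_b[0])
print("global:   ", global_a[0], global_b[0])
```

With per-batch statistics, the smallest value in each batch maps to the same Z-score, so the model can't tell the two batches apart. With the precomputed constants, the same raw value always maps to the same Z-score regardless of which batch it appears in.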
Last updated (UTC): 2024-11-14.