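Experiment Results

We update your experiment results every few weeks until the experiment ends. To view your results, click the experiment name in the experiments dashboard. Do not end an experiment based on early results, because they can be misleading. Wait until the experiment has ended before deciding which version of the content to publish.

Tip: After the experiment ends, be sure to publish the winning content by resubmitting it in the A+ Content Manager.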

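Interpreting Results

Using the data collected during the experiment, we calculate a range of possible impacts from publishing each version of the content. Results are aggregated across all ASINs enrolled in the experiment. We provide the following kinds of results:

  1. The probability that one version of the content is better. For example, if the results show a 75% probability that Version A is better, then 75% of the possible impacts we calculated show that publishing Version A would increase units or sales.

  2. For each version of the content, we show units, sales, conversion rate, units sold per unique visitor, and the sample size assigned to that version. The conversion rate is the percentage of unique visitors in the experiment who saw the A+ content and made a purchase. Sample size is the number of unique visitors who viewed each version of the content. Units sold per unique visitor equals units divided by sample size.

  3. Projected one-year impact. This section is populated only for completed experiments. It shows an estimate of the additional units and sales that publishing the winning version could bring over the next year. For a high-confidence winner, most of the projected impacts are positive. For a low-confidence winner, the "worse case" impact may be negative, because the content that performed worse during the experiment may still turn out to perform better over time.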
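The per-version metrics in item 2 can be reproduced with simple arithmetic. The following is a minimal sketch only: the class name, field names, and all numbers are made up for illustration and do not correspond to actual dashboard fields or exports.

```python
from dataclasses import dataclass


@dataclass
class VersionStats:
    """Per-version totals. Field names are hypothetical, not dashboard fields."""
    units: int        # units sold to visitors assigned to this version
    buyers: int       # unique visitors who made a purchase
    sample_size: int  # unique visitors who saw this version

    @property
    def conversion_rate(self) -> float:
        # Share of unique visitors who saw the content and purchased.
        return self.buyers / self.sample_size

    @property
    def units_per_visitor(self) -> float:
        # Units divided by sample size.
        return self.units / self.sample_size


# Made-up example numbers.
version_a = VersionStats(units=260, buyers=200, sample_size=4000)
version_b = VersionStats(units=300, buyers=230, sample_size=4000)

for name, stats in (("A", version_a), ("B", version_b)):
    print(f"Version {name}: conversion rate {stats.conversion_rate:.2%}, "
          f"units per unique visitor {stats.units_per_visitor:.4f}")
```

A version can have more units than buyers because a single visitor may purchase several units, which is why the two ratios are reported separately.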

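Projected One-Year Impact

To project the one-year impact, we calculate the average daily sales increase of the winning content and multiply it by 365. This is an estimate that does not account for seasonality, price changes, or other factors that affect your actual business. It is provided for reference only, and we cannot guarantee any incremental benefit.

The "Likely" column shows the median (50th percentile) of the range of possible outcomes we calculated. The "Best Case" and "Worse Case" columns show the 95% confidence interval of those outcomes.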
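As a rough illustration of how the three columns relate to one another, the sketch below annualizes a set of simulated daily-lift values and reads off the median and the ends of a 95% interval. Everything here is an assumption made for the example: the normal distribution and its parameters are made up, and the interval is taken as equal-tailed (2.5th to 97.5th percentile), which the text above does not specify.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Assumption: 10,000 simulated values of the daily unit lift. A normal
# distribution with made-up parameters stands in for the real model.
daily_lift_draws = [random.gauss(1.2, 0.8) for _ in range(10_000)]

# Annualize each value: daily lift multiplied by 365.
annual_draws = sorted(lift * 365 for lift in daily_lift_draws)


def percentile(sorted_values, p):
    """Nearest-rank style percentile of an already sorted list (0 <= p <= 100)."""
    index = round(p / 100 * (len(sorted_values) - 1))
    return sorted_values[index]


likely = statistics.median(annual_draws)    # 50th percentile
worse_case = percentile(annual_draws, 2.5)  # lower end of the 95% interval
best_case = percentile(annual_draws, 97.5)  # upper end of the 95% interval

print(f"Likely: {likely:.0f} units")
print(f"Worse case: {worse_case:.0f} units")
print(f"Best case: {best_case:.0f} units")
```

With these made-up inputs the worse case comes out negative even though the likely value is positive, which mirrors the low-confidence situation described for the projected one-year impact.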

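Inconclusive Results

An experiment can end with inconclusive results, or with results that show only low confidence that one version of the content is better than the other. These results can still be valuable.

Here are some reasons why an experiment may produce inconclusive results:

  • The change you made to the content was too small to significantly change customer behavior

  • There was not enough traffic to determine a winning version with high confidence

  • The two versions of the content you tested were similarly effective at driving sales

  • Most customers do not care about the change you made when they make a purchase decision

When trying to make sense of inconclusive results, refer back to your experiment hypothesis. For example, depending on what you changed, an inconclusive result can tell you that a certain type of content is not worth investing in because it does not affect customer behavior. It can also tell you that two ways of merchandising your product are equally effective. You can run additional experiments to confirm what you learned from earlier tests.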

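Experiment Methodology

These notes on experiment methodology can help you understand how we choose the winning version and project its impact; however, you do not need to understand them to run an experiment.

Experiments are based on individual customer accounts. During an experiment, every customer account that sees your content is considered part of the experiment. Customers are randomly assigned to one version of the content and, as long as the customer can be identified, are shown that same version for the duration of the experiment, regardless of device type or other factors. Visits to your page where the customer cannot be identified are not included in the sample size. We may automatically remove certain types of data from the sample, such as statistical outliers, to improve the accuracy of the results.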
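One common way to get an assignment that is effectively random across customers yet stable for each customer is to hash an account identifier. The sketch below shows that general technique only. It is an assumption for illustration, not a description of how Amazon assigns customers, and the function name and identifiers are hypothetical.

```python
import hashlib


def assign_version(customer_id: str, experiment_id: str) -> str:
    """Return "A" or "B" for a customer. Hypothetical helper for illustration."""
    key = f"{experiment_id}:{customer_id}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    # The parity of the first byte splits customers roughly 50/50.
    return "A" if digest[0] % 2 == 0 else "B"


# The same account maps to the same version on every visit, on any device.
first_visit = assign_version("customer-123", "experiment-42")
later_visit = assign_version("customer-123", "experiment-42")
print(first_visit == later_visit)  # True
```

Because the result depends only on the identifiers, no per-customer state has to be stored. A visit without an identifiable account has no stable key to hash, which is consistent with such visits being excluded from the sample size.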

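We use a Bayesian approach to analyze experiment results. This means that we construct a probability distribution from a model together with the actual experiment results. We report the mean effect size of the posterior probability distribution (in terms of the change in units) and its 95% confidence interval (also known as a credible interval). During the experiment, these figures are updated weekly using all experiment data collected since the start. The confidence in a winning treatment is the percentage of outcomes in the probability distribution that show a positive impact on unit sales.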
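To make the idea concrete, here is a minimal sketch of a Bayesian comparison. It is an illustration under stated assumptions, not the model used for these experiments: it compares conversion rates rather than units, uses a uniform Beta(1, 1) prior with a binomial likelihood, and all counts are made up.

```python
import random

random.seed(1)  # fixed seed so the sketch is repeatable


def posterior_conversion_draws(buyers, visitors, n_draws=20_000):
    """Sample the posterior conversion rate under a uniform Beta(1, 1) prior."""
    return [
        random.betavariate(1 + buyers, 1 + visitors - buyers)
        for _ in range(n_draws)
    ]


# Made-up observed counts for each version.
draws_a = posterior_conversion_draws(buyers=200, visitors=4000)
draws_b = posterior_conversion_draws(buyers=230, visitors=4000)

# Posterior distribution of the lift of B over A.
lift_draws = sorted(b - a for a, b in zip(draws_a, draws_b))

# Confidence that B wins: share of sampled outcomes with a positive lift.
prob_b_better = sum(lift > 0 for lift in lift_draws) / len(lift_draws)
mean_lift = sum(lift_draws) / len(lift_draws)

# Equal-tailed 95% credible interval taken from the sorted samples.
lower = lift_draws[int(0.025 * len(lift_draws))]
upper = lift_draws[int(0.975 * len(lift_draws))]

print(f"Probability that B is better: {prob_b_better:.1%}")
print(f"Mean lift: {mean_lift:.4f}, 95% interval: [{lower:.4f}, {upper:.4f}]")
```

The probability reported here is read the same way as the 75% example under Interpreting Results: it is the share of sampled outcomes in which the version comes out ahead.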

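To project the one-year impact, we compute the average daily difference in sales between the winning and losing treatments over the experiment period and multiply it by 365. We provide a 95% confidence interval for the impact, based on the posterior probability distribution.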


Original text from the Amazon website:

Experiment Results

We’ll update your experiment results every few weeks until it ends. You can access your results by clicking on the experiment name in the experiments dashboard. Don’t end your experiment based on early results, as those can be misleading. Instead, wait for your experiment to end before deciding what content to publish.

Tip: When your experiment ends, make sure to publish the winning content by re-submitting it in the A+ content manager.

Interpreting Results

Based on the data collected during the experiment, we calculate a range of possible impacts of publishing each piece of content. Results are aggregated for all enrolled ASINs in an experiment. We provide a few kinds of results:

  1. Probability that one version of content is better. For example, if we say there is a 75% probability that Version A is better, that means that 75% of the possible impacts that we calculated show a likely positive units/sales lift from publishing Version A.

  2. For each version of content, we show: Units, sales, conversion rate, units sold per unique visitor, and sample size assigned to that version. The conversion rate is the percentage of unique visitors in the experiment who saw the A+ content and made a purchase. Sample size is the count of unique visitors who saw each version of content. Units per unique visitor is units divided by sample size.

  3. Projected one-year impact. This section is only populated for completed experiments. It shows an estimate of the possible incremental units and sales over the next year from publishing the winning version of content. For high-confidence winners, you will notice that most of the projected impacts are positive. For low-confidence winners, you will notice that the ‘worse case’ impact may be negative. This is because there is still a chance that the content that performed worse during the experiment may actually perform better over time.

Projected One-Year Impact

To project one year impact, we calculate the average daily sales increase of the winning content and multiply by 365. This is an estimate which doesn’t take into account seasonality, price changes, or other factors that would affect your business in the real world; it is provided for informational purposes only and we cannot guarantee any incremental benefits.

The Likely column shows the median (the 50th percentile) of the range of possible outcomes we calculated. The Best Case and Worse Case columns show the 95% confidence interval of those outcomes.

Inconclusive Results

An experiment can end with results that are inconclusive, or results that show a low confidence that one version of content is better than another. However, these results can still be valuable.

Here are some reasons why an experiment may have inconclusive results:

  • The change you made to your content was too small to significantly change customer behavior

  • There wasn’t enough traffic to determine the winning content with high confidence

  • The two versions of content you tested were similarly effective in driving sales

  • The change you made to your content isn’t something that most customers care about when making a purchase decision

Refer to your experiment hypothesis when trying to make sense of inconclusive results. For example, depending on what you changed, an inconclusive result can tell you that a certain type of content isn’t worth investing in because it doesn’t affect customer behavior. Or, it can tell you that two ways of merchandising your product are similarly effective. You can run additional experiments to confirm what you’ve learned from your earlier tests.

Experiment Methodology

These notes on experiment methodology may help you understand how we choose an experiment winner and project impact; however, this is not required to run an experiment.

Experiments are based on individual customer accounts. During an experiment, each customer account that sees your content is considered part of the experiment. Customers are randomly assigned to view one version of content and will see that content persistently for the duration of the experiment regardless of device type or other factors, as long as the customer can be identified. Visits to your page where a customer cannot be identified are not included in the sample size. We may automatically remove certain types of data from the sample to improve the accuracy of results, such as statistical outliers.

We use a Bayesian approach to analyze experiment results. This means we construct a probability distribution based on a model as well as the actual results of the experiment. We report the mean effect size (in terms of change in units) as well as the 95% confidence interval (also known as a credible interval) of the posterior probability distribution, which is updated weekly during the experiment based on all experiment data collected since the start. The confidence of a winning treatment is the percentage of outcomes in the probability distribution that show a positive unit sales impact.

To project one year impact, we compute the average difference between the winning and losing treatment sales per day for the duration of the experiment so far and multiply by 365. We provide a 95% confidence interval for the impact which is based on the posterior probability distribution.

Source: Amazon official website