Course Summary
Here is a quick summary of what you learned in the course:
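- A decision tree is a model composed of a collection of conditions organized hierarchically in the shape of a tree. Conditions fall into various categories:
  - An axis-aligned condition involves only a single feature. An oblique condition involves multiple features.
  - A binary condition has two possible outcomes. A non-binary condition has more than two possible outcomes.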
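- Training a decision tree involves searching for the best condition at each node. The splitter routine uses metrics such as information gain or Gini impurity to determine the best condition.
- A decision forest is a model made of multiple decision trees. The prediction of a decision forest is the aggregation of the predictions of its decision trees.
- A random forest is an ensemble of decision trees in which each decision tree is trained with a specific random noise.
- Bagging is a technique in which each decision tree in the random forest is trained on a different subset of examples.
- Random forests do not require a validation dataset. Instead, most random forests use a technique called out-of-bag evaluation to evaluate the quality of the model.
- A gradient boosted (decision) tree is a type of decision forest trained iteratively, with each new decision tree adjusting the predictions of the trees trained before it. A value called shrinkage controls the rate at which a gradient boosted (decision) tree learns and the degree to which it could overfit.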
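The splitter search can be illustrated with a minimal sketch. It assumes numeric features, binary axis-aligned conditions of the form `row[feature] >= threshold`, and Gini impurity reduction as the score. The function names `gini` and `best_axis_aligned_split` are hypothetical and are not taken from the course or from any library.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 for a pure or empty list)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_axis_aligned_split(rows, labels):
    """Try every feature and threshold; keep the largest Gini reduction.

    Returns (feature_index, threshold, gain) for the condition
    `row[feature_index] >= threshold`, or None if no split exists.
    """
    parent = gini(labels)
    n = len(labels)
    best = None
    for feature in range(len(rows[0])):
        values = sorted({row[feature] for row in rows})
        # Candidate thresholds: midpoints between consecutive distinct values.
        for low, high in zip(values, values[1:]):
            threshold = (low + high) / 2
            pos = [y for row, y in zip(rows, labels) if row[feature] >= threshold]
            neg = [y for row, y in zip(rows, labels) if row[feature] < threshold]
            weighted = (len(pos) * gini(pos) + len(neg) * gini(neg)) / n
            gain = parent - weighted
            if best is None or gain > best[2]:
                best = (feature, threshold, gain)
    return best


# Toy data (made up for illustration): feature 0 separates the classes.
rows = [[1.0, 5.0], [2.0, 3.0], [3.0, 6.0], [4.0, 4.0]]
labels = ["a", "a", "b", "b"]
split = best_axis_aligned_split(rows, labels)
print(split)
```

A real splitter would typically sort each feature once and update class counts incrementally instead of rescanning the examples for every threshold; the exhaustive loop here is chosen only for readability.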
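Bagging and out-of-bag evaluation can be sketched at the level of example indices. The sketch assumes each tree's training subset is drawn with replacement (a bootstrap sample), which is the usual way bagging is done; the helper names `bootstrap_indices` and `oob_voters` are hypothetical.

```python
import random


def bootstrap_indices(num_examples, num_trees, seed=0):
    """For each tree, draw a training sample with replacement and record the
    examples that were never drawn (the out-of-bag examples for that tree)."""
    rng = random.Random(seed)
    bags = []
    for _ in range(num_trees):
        in_bag = [rng.randrange(num_examples) for _ in range(num_examples)]
        out_of_bag = sorted(set(range(num_examples)) - set(in_bag))
        bags.append((in_bag, out_of_bag))
    return bags


def oob_voters(bags, num_examples):
    """For each example, list the trees that did not train on it."""
    voters = {i: [] for i in range(num_examples)}
    for tree_id, (_, out_of_bag) in enumerate(bags):
        for i in out_of_bag:
            voters[i].append(tree_id)
    return voters


bags = bootstrap_indices(num_examples=10, num_trees=5)
voters = oob_voters(bags, num_examples=10)
for example, trees in voters.items():
    print(example, trees)
```

In out-of-bag evaluation, each example would be scored using only the trees listed for it in `voters`, so no example is evaluated by a tree that saw it during training. This is why a separate validation dataset is not required.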
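The role of shrinkage can be shown with a small sketch of gradient boosting. It makes several simplifying assumptions that are not from the course: squared-error loss (so each new tree is fit to the current residuals), a single numeric feature, and one-split regression trees (stumps) as the weak learners. The names `fit_stump`, `fit_gbt`, and `mse` are hypothetical.

```python
def fit_stump(xs, residuals):
    """Fit a one-split regression tree (a stump) on a single numeric feature."""
    best = None
    values = sorted(set(xs))
    for low, high in zip(values, values[1:]):
        threshold = (low + high) / 2
        left = [r for x, r in zip(xs, residuals) if x < threshold]
        right = [r for x, r in zip(xs, residuals) if x >= threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x < threshold else right_mean


def fit_gbt(xs, ys, num_trees, shrinkage):
    """Train stumps one after another; each fits the residuals of the model
    so far, and its contribution is scaled by `shrinkage`."""
    initial = sum(ys) / len(ys)
    trees = []
    predictions = [initial] * len(xs)
    for _ in range(num_trees):
        residuals = [y - p for y, p in zip(ys, predictions)]
        tree = fit_stump(xs, residuals)
        trees.append(tree)
        predictions = [p + shrinkage * tree(x) for x, p in zip(xs, predictions)]
    return lambda x: initial + shrinkage * sum(tree(x) for tree in trees)


def mse(model, xs, ys):
    """Mean squared error of `model` on the given examples."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Toy data (made up for illustration).
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [1.0, 1.5, 2.0, 5.0, 5.5, 6.0]
slow = fit_gbt(xs, ys, num_trees=5, shrinkage=0.1)
fast = fit_gbt(xs, ys, num_trees=5, shrinkage=1.0)
print(mse(slow, xs, ys), mse(fast, xs, ys))
```

With the same number of trees, the larger shrinkage value fits the training data more closely. That faster fit is also what makes overfitting more likely, which is the trade-off the shrinkage value controls.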
Last updated (UTC): 2025-02-25.