* RANDOM FOREST *
Random forests (or random decision forests) are an ensemble learning method for classification, regression and other tasks. Instead of relying on a single decision tree, the algorithm builds many individual trees and combines their predictions. Each tree grows by node splitting, and at every split it considers only a random subset of the features; this keeps the trees from repeating one another's mistakes, so no single progressive error can dominate the forest. The training set is the most significant determinant of the resulting forest. Once every tree has produced its prediction, the forest aggregates them into a final outcome: a majority vote for classification, an average for regression.
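The aggregation step described above can be sketched in a few lines. This is a minimal illustration, not a full implementation: `forest_predict` and the hard-coded class labels are hypothetical stand-ins for the per-tree predictions a trained forest would produce.

```python
from collections import Counter
from statistics import mean

def forest_predict(tree_predictions):
    """Classification: aggregate per-tree class votes by majority."""
    return Counter(tree_predictions).most_common(1)[0][0]

def forest_regress(tree_predictions):
    """Regression: aggregate per-tree numeric outputs by averaging."""
    return mean(tree_predictions)

# Three hypothetical trees vote on one sample's class.
print(forest_predict(["cat", "dog", "cat"]))  # → cat

# Three hypothetical trees estimate one sample's value.
print(forest_regress([2.0, 2.4, 2.2]))  # → 2.2
```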
[Figure: basic random forest processing algorithm]
PRODUCTIVITY AND ADVANTAGES:
1. Using multiple trees reduces the risk of overfitting and lowers the chance of error.
2. It can be run efficiently on large databases.
3. For large datasets it produces highly accurate predictions.
4. Random forests can maintain accuracy even when a large proportion of the data is missing.
5. Because the individual trees are largely uncorrelated, their combined outcome stays accurate.
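Advantages 1 and 5 rest on the same statistical fact: averaging many uncorrelated estimates shrinks the variance of the result. A small sketch of that effect, using simulated noisy "trees" (the `noisy_estimate` function and its parameters are illustrative assumptions, not part of any real forest):

```python
import random
import statistics

random.seed(0)

def noisy_estimate():
    # One hypothetical "tree": the true value 1.0 plus independent noise.
    return 1.0 + random.gauss(0, 1)

# 1000 predictions from single trees vs. 1000 from 100-tree ensembles.
single = [noisy_estimate() for _ in range(1000)]
ensembles = [statistics.mean(noisy_estimate() for _ in range(100))
             for _ in range(1000)]

# The ensemble's spread is roughly 10x smaller than a single tree's.
print(statistics.stdev(single), statistics.stdev(ensembles))
```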
[Figure: classification]
CHALLENGES:
Bagging, or bootstrap aggregation, addresses one of the main challenges of decision trees: they are very sensitive to the data they are trained on, so a small change to the training set can produce a significantly different tree structure. Rather than fighting this sensitivity, bagging exploits it. Each tree is trained on a random sample of the training set drawn with replacement, so the trees differ from one another and their individual errors tend to cancel out when the forest aggregates their predictions.
[Figure: regression processes]
CONCLUSION:
In the world of finance and investments such uncorrelated models are indispensable, and building such a fortune through random forests requires attention to a few aspects. We need features that have at least some predictive power; after all, good output requires good input. The trees of the forest, and more importantly their predictions, need to be uncorrelated (or at least have low correlations with each other). While the algorithm itself tries to engineer these low correlations for us via feature randomness, the features we select and the hyper-parameters we choose will impact the ultimate correlations as well.