XGBoost (eXtreme Gradient Boosting), one of the ensemble models, is hugely popular among Kaggle users.
*Ensemble: a method that combines several learning algorithms to obtain better performance than a single model
Ensembles come in two main flavors: bagging and boosting.
Ensemble types: single (e.g., one CNN or RNN), bagging, boosting
*Bagging: several learning algorithms/models each make their own prediction, and all predictions are treated equally and aggregated (for example by voting or averaging) to produce the final result
*Boosting: unlike bagging, the models' results are combined sequentially. Rather than simply aggregating them one at a time, the samples the previous model mispredicted are given higher weight, and the next model is trained again with that emphasis
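To make the contrast concrete, here is a minimal sketch using scikit-learn (an assumption; it is not used elsewhere in this post) on a synthetic dataset: bagging trains its trees independently and combines them with equal weight, while boosting trains them one after another, each new tree focusing on the previous trees' mistakes.

from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier, GradientBoostingClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, random_state=0)      # synthetic toy data
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# bagging: trees trained independently, predictions combined with equal weight
bag = BaggingClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)

# boosting: trees trained sequentially, each one correcting the previous trees' errors
boost = GradientBoostingClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)

print('bagging accuracy :', bag.score(X_te, y_te))
print('boosting accuracy:', boost.score(X_te, y_te))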
XGBoost is a model built on tree boosting, one of the boosting techniques.
Where a random forest trains several decision trees and averages their results, XGB applies the same multi-tree idea but assigns weights to the samples that were predicted incorrectly. The weighted errors then get extra attention so that the next round can predict them correctly, and this cycle of finding the remaining errors and improving on them is repeated.
Ultimately, XGB optimizes this tree-boosting procedure with gradient descent. It also uses parallel processing to cut the amount of computation, so training is fast.
*Gradient descent: cost(W) = (1/m) Σᵢ (W·xᵢ − yᵢ)²
The point where cost(W) is minimal is found by following its derivative (gradient).
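As a small worked example of the formula above, here is a NumPy sketch of gradient descent on cost(W) = (1/m) Σ (W·xᵢ − yᵢ)²; the data and learning rate are made-up illustration values.

import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.0, 4.0, 6.0, 8.0])   # toy data generated by y = 2x
W = 0.0                              # initial weight
lr = 0.01                            # learning rate
m = len(x)

for step in range(200):
    pred = W * x
    grad = (2.0 / m) * np.sum((pred - y) * x)   # d cost / dW
    W -= lr * grad                              # step against the gradient

print(W)   # converges toward 2.0, where cost(W) is minimal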
import xgboost as xgb

train_data = xgb.DMatrix(train_input.sum(axis=1), label=train_label)  # convert to XGBoost's DMatrix format
eval_data = xgb.DMatrix(eval_input.sum(axis=1), label=eval_label)
data_list = [(train_data, 'train'), (eval_data, 'valid')]

params = {}
params['objective'] = 'binary:logistic'
params['eval_metric'] = 'rmse'

bst = xgb.train(params, train_data, num_boost_round=1000, evals=data_list,
                early_stopping_rounds=10)
# parameters (from the XGBoost documentation)

- objective [default=reg:squarederror]
  - reg:squarederror: regression with squared loss.
  - reg:squaredlogerror: regression with squared log loss, (1/2)[log(pred+1) − log(label+1)]². All input labels are required to be greater than -1. Also, see metric rmsle for a possible issue with this objective.
  - reg:logistic: logistic regression
  - reg:pseudohubererror: regression with Pseudo Huber loss, a twice differentiable alternative to absolute loss.
  - binary:logistic: logistic regression for binary classification, outputs probability
  - binary:logitraw: logistic regression for binary classification, outputs score before logistic transformation
  - binary:hinge: hinge loss for binary classification. This makes predictions of 0 or 1, rather than producing probabilities.
  - count:poisson: Poisson regression for count data, outputs mean of the Poisson distribution
    - max_delta_step is set to 0.7 by default in Poisson regression (used to safeguard optimization)
  - survival:cox: Cox regression for right-censored survival time data (negative values are considered right-censored). Note that predictions are returned on the hazard-ratio scale (i.e., as HR = exp(marginal_prediction) in the proportional hazard function h(t) = h0(t) * HR).
  - survival:aft: Accelerated failure time model for censored survival time data. See Survival Analysis with Accelerated Failure Time for details.
  - aft_loss_distribution: Probability Density Function used by the survival:aft objective and the aft-nloglik metric.
  - multi:softmax: set XGBoost to do multiclass classification using the softmax objective; you also need to set num_class (number of classes)
  - multi:softprob: same as softmax, but outputs a vector of ndata * nclass, which can be further reshaped to an ndata * nclass matrix. The result contains the predicted probability of each data point belonging to each class.
  - rank:pairwise: use LambdaMART to perform pairwise ranking where the pairwise loss is minimized
  - rank:ndcg: use LambdaMART to perform list-wise ranking where Normalized Discounted Cumulative Gain (NDCG) is maximized
  - rank:map: use LambdaMART to perform list-wise ranking where Mean Average Precision (MAP) is maximized
  - reg:gamma: gamma regression with log-link. Output is a mean of the gamma distribution. It might be useful, e.g., for modeling insurance claims severity, or for any outcome that might be gamma-distributed.
  - reg:tweedie: Tweedie regression with log-link. It might be useful, e.g., for modeling total loss in insurance, or for any outcome that might be Tweedie-distributed.
- eval_metric [default according to objective]
  - Evaluation metrics for validation data; a default metric is assigned according to the objective (rmse for regression, logloss for classification, mean average precision for ranking)
  - Users can add multiple evaluation metrics. Python users: remember to pass the metrics as a list of parameter pairs instead of a map, so that a later eval_metric won't override a previous one (see the sketch after this list)
  - The choices are listed below:
    - rmse: root mean square error
    - rmsle: root mean square log error, sqrt((1/N) Σ [log(pred+1) − log(label+1)]²). Default metric of the reg:squaredlogerror objective. This metric reduces errors generated by outliers in the dataset. But because the log function is employed, rmsle might output nan when a prediction value is less than -1. See reg:squaredlogerror for other requirements.
    - mae: mean absolute error
    - mphe: mean Pseudo Huber error. Default metric of the reg:pseudohubererror objective.
    - logloss: negative log-likelihood
    - error: binary classification error rate. It is calculated as #(wrong cases)/#(all cases). For the predictions, the evaluation regards instances with a prediction value larger than 0.5 as positive instances, and the others as negative instances.
    - error@t: a binary classification threshold different from 0.5 can be specified by providing a numerical value through 't'.
    - merror: multiclass classification error rate. It is calculated as #(wrong cases)/#(all cases).
    - mlogloss: multiclass logloss.
    - auc: area under the curve. Available for binary classification and learning-to-rank tasks.
    - aucpr: area under the PR curve. Available for binary classification and learning-to-rank tasks.
    - ndcg@n, map@n: 'n' can be assigned as an integer to cut off the top positions in the lists for evaluation.
    - ndcg-, map-, ndcg@n-, map@n-: in XGBoost, NDCG and MAP evaluate the score of a list without any positive samples as 1. By adding "-" to the evaluation metric, XGBoost evaluates these scores as 0, to be consistent under some conditions.
    - poisson-nloglik: negative log-likelihood for Poisson regression
    - gamma-nloglik: negative log-likelihood for gamma regression
    - cox-nloglik: negative partial log-likelihood for Cox proportional hazards regression
    - gamma-deviance: residual deviance for gamma regression
    - tweedie-nloglik: negative log-likelihood for Tweedie regression (at a specified value of the tweedie_variance_power parameter)
    - aft-nloglik: negative log-likelihood of the Accelerated Failure Time model. See Survival Analysis with Accelerated Failure Time for details.
    - interval-regression-accuracy: fraction of data points whose predicted labels fall in the interval-censored labels. Only applicable to interval-censored data. See Survival Analysis with Accelerated Failure Time for details.
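As a hedged sketch of how the options above are combined in practice: multi:softmax needs num_class, and multiple eval_metric entries must be passed as a list of pairs so the second does not overwrite the first. The DMatrix objects here (multi_train_data, multi_data_list) are hypothetical, with labels 0–2; they are not defined elsewhere in this post.

params = [
    ('objective', 'multi:softmax'),
    ('num_class', 3),               # required for multi:softmax / multi:softprob
    ('eval_metric', 'merror'),
    ('eval_metric', 'mlogloss'),    # list form keeps both metrics
]
bst_multi = xgb.train(params, multi_train_data, num_boost_round=100,
                      evals=multi_data_list, early_stopping_rounds=10)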
num_boost_round = 1000 -> train for up to 1000 boosting rounds (epochs)
early_stopping_rounds = 10 -> stop early if the validation error barely improves for 10 consecutive rounds
Result:
[320] train-rmse:0.38445 valid-rmse:0.42042
[321] train-rmse:0.38431 valid-rmse:0.42039
[322] train-rmse:0.38430 valid-rmse:0.42039
[323] train-rmse:0.38428 valid-rmse:0.42039
[324] train-rmse:0.38421 valid-rmse:0.42040
[325] train-rmse:0.38406 valid-rmse:0.42040
[326] train-rmse:0.38404 valid-rmse:0.42041
[327] train-rmse:0.38399 valid-rmse:0.42040
[328] train-rmse:0.38393 valid-rmse:0.42041
[329] train-rmse:0.38390 valid-rmse:0.42041
[330] train-rmse:0.38377 valid-rmse:0.42042
[331] train-rmse:0.38362 valid-rmse:0.42041
Perform the same steps for the test data (test_data).
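A minimal sketch of that step, assuming test_input and test_label exist alongside the train/eval arrays (those names are assumptions, following the pattern of the code above):

test_data = xgb.DMatrix(test_input.sum(axis=1), label=test_label)
test_pred = bst.predict(test_data)                 # probabilities, since the objective is binary:logistic
test_pred_label = (test_pred > 0.5).astype(int)    # threshold at 0.5 to get class labels

# when early stopping has triggered, the best round and score are stored on the booster
print(bst.best_iteration, bst.best_score)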