{
"version": "1.0.0",
"cells": [
{
"type": "md",
"input": "# H2O - KDDCup 2009 Churn Prediction Demo\n\nThis demo is based on the KDDCup 2009 churn prediction challenge (the \"small\" dataset): http://www.sigkdd.org/kdd-cup-2009-customer-relationship-prediction\n\nCustomer Relationship Management (CRM) is a key element of modern marketing strategies. The KDD Cup 2009 offers the opportunity to work on large marketing databases from the French Telecom company Orange to predict the propensity of customers to switch provider (churn).\n\n**Note**: We recommend that you first download the datasets (about 45MB).\n\n#### Pre-split data (40k train / 10k test), and binary labels for churn or not\n* https://h2o-public-test-data.s3.amazonaws.com/bigdata/laptop/kdd2009/small-churn/kdd_train.csv\n* https://h2o-public-test-data.s3.amazonaws.com/bigdata/laptop/kdd2009/small-churn/kdd_valid.csv\n\n#### Notes\n\n* Higher AUC values can be obtained with parameter tuning (using mostly defaults here)\n* Please specify the path to the downloaded files (or paste the above URLs directly)"
},
{
"type": "md",
"input": "---\n\nYou can specify location of input data in a separated cell:"
},
{
"type": "cs",
"input": "# Did you download the files?\nhasLocalData = false\n# Point to folder containing them\nlocation = \"./bigdata/laptop/kdd2009/small-churn/\"\n\n# Configure file locations\nif hasLocalData\n trainFile = location + \"kdd_train.csv\"\n validFile = location + \"kdd_valid.csv\"\nelse\n trainFile = \"https://h2o-public-test-data.s3.amazonaws.com/bigdata/laptop/kdd2009/small-churn/kdd_train.csv\"\n validFile = \"https://h2o-public-test-data.s3.amazonaws.com/bigdata/laptop/kdd2009/small-churn/kdd_valid.csv\""
},
{
"type": "md",
"input": "\nAdd use variables `trainFile` and `validFile` in cells below:\n"
},
{
"type": "cs",
"input": "parseFiles\n paths: [trainFile]\n destination_frame: \"kdd_train.hex\"\n parse_type: \"CSV\"\n separator: 44\n number_columns: 231\n single_quotes: false\n column_names: [\"churn\",\"Var1\",\"Var2\",\"Var3\",\"Var4\",\"Var5\",\"Var6\",\"Var7\",\"Var8\",\"Var9\",\"Var10\",\"Var11\",\"Var12\",\"Var13\",\"Var14\",\"Var15\",\"Var16\",\"Var17\",\"Var18\",\"Var19\",\"Var20\",\"Var21\",\"Var22\",\"Var23\",\"Var24\",\"Var25\",\"Var26\",\"Var27\",\"Var28\",\"Var29\",\"Var30\",\"Var31\",\"Var32\",\"Var33\",\"Var34\",\"Var35\",\"Var36\",\"Var37\",\"Var38\",\"Var39\",\"Var40\",\"Var41\",\"Var42\",\"Var43\",\"Var44\",\"Var45\",\"Var46\",\"Var47\",\"Var48\",\"Var49\",\"Var50\",\"Var51\",\"Var52\",\"Var53\",\"Var54\",\"Var55\",\"Var56\",\"Var57\",\"Var58\",\"Var59\",\"Var60\",\"Var61\",\"Var62\",\"Var63\",\"Var64\",\"Var65\",\"Var66\",\"Var67\",\"Var68\",\"Var69\",\"Var70\",\"Var71\",\"Var72\",\"Var73\",\"Var74\",\"Var75\",\"Var76\",\"Var77\",\"Var78\",\"Var79\",\"Var80\",\"Var81\",\"Var82\",\"Var83\",\"Var84\",\"Var85\",\"Var86\",\"Var87\",\"Var88\",\"Var89\",\"Var90\",\"Var91\",\"Var92\",\"Var93\",\"Var94\",\"Var95\",\"Var96\",\"Var97\",\"Var98\",\"Var99\",\"Var100\",\"Var101\",\"Var102\",\"Var103\",\"Var104\",\"Var105\",\"Var106\",\"Var107\",\"Var108\",\"Var109\",\"Var110\",\"Var111\",\"Var112\",\"Var113\",\"Var114\",\"Var115\",\"Var116\",\"Var117\",\"Var118\",\"Var119\",\"Var120\",\"Var121\",\"Var122\",\"Var123\",\"Var124\",\"Var125\",\"Var126\",\"Var127\",\"Var128\",\"Var129\",\"Var130\",\"Var131\",\"Var132\",\"Var133\",\"Var134\",\"Var135\",\"Var136\",\"Var137\",\"Var138\",\"Var139\",\"Var140\",\"Var141\",\"Var142\",\"Var143\",\"Var144\",\"Var145\",\"Var146\",\"Var147\",\"Var148\",\"Var149\",\"Var150\",\"Var151\",\"Var152\",\"Var153\",\"Var154\",\"Var155\",\"Var156\",\"Var157\",\"Var158\",\"Var159\",\"Var160\",\"Var161\",\"Var162\",\"Var163\",\"Var164\",\"Var165\",\"Var166\",\"Var167\",\"Var168\",\"Var169\",\"Var170\",\"Var171\",\"Var172\",\"Var173\",\"Var174\",\"V
ar175\",\"Var176\",\"Var177\",\"Var178\",\"Var179\",\"Var180\",\"Var181\",\"Var182\",\"Var183\",\"Var184\",\"Var185\",\"Var186\",\"Var187\",\"Var188\",\"Var189\",\"Var190\",\"Var191\",\"Var192\",\"Var193\",\"Var194\",\"Var195\",\"Var196\",\"Var197\",\"Var198\",\"Var199\",\"Var200\",\"Var201\",\"Var202\",\"Var203\",\"Var204\",\"Var205\",\"Var206\",\"Var207\",\"Var208\",\"Var209\",\"Var210\",\"Var211\",\"Var212\",\"Var213\",\"Var214\",\"Var215\",\"Var216\",\"Var217\",\"Var218\",\"Var219\",\"Var220\",\"Var221\",\"Var222\",\"Var223\",\"Var224\",\"Var225\",\"Var226\",\"Var227\",\"Var228\",\"Var229\",\"Var230\"]\n column_types: [\"Enum\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Nu
meric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Enum\",\"Enum\",\"Enum\",\"Enum\",\"Enum\",\"Enum\",\"Enum\",\"Enum\",\"Enum\",\"Enum\",\"Enum\",\"Enum\",\"Enum\",\"Enum\",\"Enum\",\"Enum\",\"Enum\",\"Enum\",\"Numeric\",\"Enum\",\"Enum\",\"Enum\",\"Enum\",\"Enum\",\"Enum\",\"Enum\",\"Enum\",\"Enum\",\"Enum\",\"Enum\",\"Enum\",\"Enum\",\"Enum\",\"Enum\",\"Enum\",\"Enum\",\"Enum\",\"Enum\",\"Enum\",\"Numeric\"]\n delete_on_done: true\n check_header: 1\n chunk_size: 4194304"
},
{
"type": "cs",
"input": "parseFiles\n paths: [validFile]\n destination_frame: \"kdd_valid.hex\"\n parse_type: \"CSV\"\n separator: 44\n number_columns: 231\n single_quotes: false\n column_names: [\"churn\",\"Var1\",\"Var2\",\"Var3\",\"Var4\",\"Var5\",\"Var6\",\"Var7\",\"Var8\",\"Var9\",\"Var10\",\"Var11\",\"Var12\",\"Var13\",\"Var14\",\"Var15\",\"Var16\",\"Var17\",\"Var18\",\"Var19\",\"Var20\",\"Var21\",\"Var22\",\"Var23\",\"Var24\",\"Var25\",\"Var26\",\"Var27\",\"Var28\",\"Var29\",\"Var30\",\"Var31\",\"Var32\",\"Var33\",\"Var34\",\"Var35\",\"Var36\",\"Var37\",\"Var38\",\"Var39\",\"Var40\",\"Var41\",\"Var42\",\"Var43\",\"Var44\",\"Var45\",\"Var46\",\"Var47\",\"Var48\",\"Var49\",\"Var50\",\"Var51\",\"Var52\",\"Var53\",\"Var54\",\"Var55\",\"Var56\",\"Var57\",\"Var58\",\"Var59\",\"Var60\",\"Var61\",\"Var62\",\"Var63\",\"Var64\",\"Var65\",\"Var66\",\"Var67\",\"Var68\",\"Var69\",\"Var70\",\"Var71\",\"Var72\",\"Var73\",\"Var74\",\"Var75\",\"Var76\",\"Var77\",\"Var78\",\"Var79\",\"Var80\",\"Var81\",\"Var82\",\"Var83\",\"Var84\",\"Var85\",\"Var86\",\"Var87\",\"Var88\",\"Var89\",\"Var90\",\"Var91\",\"Var92\",\"Var93\",\"Var94\",\"Var95\",\"Var96\",\"Var97\",\"Var98\",\"Var99\",\"Var100\",\"Var101\",\"Var102\",\"Var103\",\"Var104\",\"Var105\",\"Var106\",\"Var107\",\"Var108\",\"Var109\",\"Var110\",\"Var111\",\"Var112\",\"Var113\",\"Var114\",\"Var115\",\"Var116\",\"Var117\",\"Var118\",\"Var119\",\"Var120\",\"Var121\",\"Var122\",\"Var123\",\"Var124\",\"Var125\",\"Var126\",\"Var127\",\"Var128\",\"Var129\",\"Var130\",\"Var131\",\"Var132\",\"Var133\",\"Var134\",\"Var135\",\"Var136\",\"Var137\",\"Var138\",\"Var139\",\"Var140\",\"Var141\",\"Var142\",\"Var143\",\"Var144\",\"Var145\",\"Var146\",\"Var147\",\"Var148\",\"Var149\",\"Var150\",\"Var151\",\"Var152\",\"Var153\",\"Var154\",\"Var155\",\"Var156\",\"Var157\",\"Var158\",\"Var159\",\"Var160\",\"Var161\",\"Var162\",\"Var163\",\"Var164\",\"Var165\",\"Var166\",\"Var167\",\"Var168\",\"Var169\",\"Var170\",\"Var171\",\"Var172\",\"Var173\",\"Var174\",\"V
ar175\",\"Var176\",\"Var177\",\"Var178\",\"Var179\",\"Var180\",\"Var181\",\"Var182\",\"Var183\",\"Var184\",\"Var185\",\"Var186\",\"Var187\",\"Var188\",\"Var189\",\"Var190\",\"Var191\",\"Var192\",\"Var193\",\"Var194\",\"Var195\",\"Var196\",\"Var197\",\"Var198\",\"Var199\",\"Var200\",\"Var201\",\"Var202\",\"Var203\",\"Var204\",\"Var205\",\"Var206\",\"Var207\",\"Var208\",\"Var209\",\"Var210\",\"Var211\",\"Var212\",\"Var213\",\"Var214\",\"Var215\",\"Var216\",\"Var217\",\"Var218\",\"Var219\",\"Var220\",\"Var221\",\"Var222\",\"Var223\",\"Var224\",\"Var225\",\"Var226\",\"Var227\",\"Var228\",\"Var229\",\"Var230\"]\n column_types: [\"Enum\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Nu
meric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Numeric\",\"Enum\",\"Enum\",\"Enum\",\"Enum\",\"Enum\",\"Enum\",\"Enum\",\"Enum\",\"Enum\",\"Enum\",\"Enum\",\"Enum\",\"Enum\",\"Enum\",\"Enum\",\"Enum\",\"Enum\",\"Enum\",\"Numeric\",\"Enum\",\"Enum\",\"Enum\",\"Enum\",\"Enum\",\"Enum\",\"Enum\",\"Enum\",\"Enum\",\"Enum\",\"Enum\",\"Enum\",\"Enum\",\"Enum\",\"Enum\",\"Enum\",\"Enum\",\"Enum\",\"Enum\",\"Enum\",\"Numeric\"]\n delete_on_done: true\n check_header: 1\n chunk_size: 4194304"
},
{
"type": "cs",
"input": "buildModel 'gbm', {\"model_id\":\"gbm-model\",\"training_frame\":\"kdd_train.hex\",\"validation_frame\":\"kdd_valid.hex\",\"ignore_const_cols\":true,\"score_each_iteration\":false,\"response_column\":\"churn\",\"ntrees\":50,\"max_depth\":\"3\",\"min_rows\":10,\"nbins\":\"4\",\"nbins_cats\":\"20\",\"r2_stopping\":0.999999,\"learn_rate\":\"0.12\",\"distribution\":\"bernoulli\",\"balance_classes\":false,\"seed\":-1941834077389261000}"
},
{
"type": "cs",
"input": "getModel \"gbm-model\""
}
]
}