Forward Thinking: Building Deep Random Forests

Kevin Miller, Chris Hettinger, Jeffrey Humpherys, Tyler Jarvis, and David Kartchner
Department of Mathematics
Brigham Young University
Provo, Utah 84602
millerk5@byu.edu, hettinger@math.byu.edu, jeffh@math.byu.edu,
jarvis@math.byu.edu, david.kartchner@math.byu.edu
Abstract

The success of deep neural networks has inspired many to wonder whether other learners could benefit from deep, layered architectures. We present a general framework called forward thinking for deep learning that generalizes the architectural flexibility and sophistication of deep neural networks while also allowing for different types of learning functions in the network, other than neurons, and the ability to adaptively deepen the network as needed to improve results. This is done by training one layer at a time, and once a layer is trained, the input data are mapped forward through the layer to create a new learning problem. The process is then repeated, transforming the data through multiple layers, one at a time, rendering a new dataset, which is expected to be better behaved, and on which a final output layer can achieve good performance. In the case where the neurons of deep neural nets are replaced with decision trees, we call the result a Forward Thinking Deep Random Forest (FTDRF). We demonstrate a proof of concept by applying FTDRF on the MNIST dataset. We also provide a general mathematical formulation, called forward thinking, that allows for other types of deep learning problems to be considered.

1 Introduction

Classification and regression trees are a fast and popular class of methods for supervised learning. For example, random forests [breiman_01], extreme gradient boosted trees [ChenG16], and conditional trees [cond_trees] have consistently performed well in several benchmarking studies where various methods compete against each other [StallkampSSI12, Statnikov2008, comparo]. In recent years, however, deep neural networks (DNNs) have become a dominant force in several areas of supervised learning, most notably in image, speech, and natural language recognition problems, where deep learning methods are also consistently beating humans [Krizhevsky, socher:2013]. Although the use of multiple layers of neurons in a "deep" architecture has been well known for many years [Lecun98], it wasn't until the discovery of feasible means of training via backpropagation that neural networks became successful. However, DNNs still suffer from a variety of problems. In particular, it is extremely expensive computationally to use backpropagation to train multiple layers of nonlinear activation functions [LeCun:1998]. This not only requires lengthy training, but also uses large quantities of memory, making the training of medium-to-large networks infeasible on a single CPU. Moreover, DNNs are highly prone to overfitting and thus require both large amounts of training data and careful use of regularization to generalize effectively. Indeed, the computational resources required to fully train a DNN are in many cases orders of magnitude more than other machine learning methods such as decision tree methods, which perform almost as well on many tasks and even better on others [comparo]. In other words, a lot of work is required to get at best only slightly better performance using DNNs.

In spite of these drawbacks, DNNs have outperformed simpler structures in a number of machine learning tasks, with authors citing the use of "deep" architectures as a necessary element of their success [resnets]. By stacking dozens of layers of weak learners (neurons), DNNs can capture the intricate relationships necessary to effectively solve a wide variety of problems. Accordingly, we propose a generalization of the DNN architecture where neurons are replaced by other classifiers. In this paper, we consider networks where each layer is a type of random forest, with neurons composed of the individual decision trees, and show how such networks can be quickly trained layer by layer instead of paying the high computational cost of training a DNN all at once.

Random forests [breiman_01, Murphy:2012] use ensembling of bootstrapped decision trees as weak classifiers, reporting the average or maximum value across all of the trees' outputs for classification probabilities and regression values. In [Liu_2008], Liu notes that variety in individual weak learners is essential to the success of ensemble learning. Accordingly, we use a combination of random decision trees and extra random trees [Geurts2006] in each layer to increase variety and thus improve performance. We create both the random decision trees and extra random trees using the implementations provided in Scikit-Learn [scikit-learn].

It is important to note that Zhou and Feng [ZhouF17] very recently posted a related idea called gcForest, where the layers of the deep architecture are comprised of multiple random forests. In their network, the connections to subsequent layers are the outputs of random forests, whereas in our paper the outputs of the individual decision trees are passed to subsequent layers of decision trees. In other words, they pass the results of the random forest through to the next layer (full random forests, each consisting of decision trees), whereas we pass the results of individual decision trees forward. We get comparable results with fewer trees, but a higher memory requirement, given that we are mapping more data to the next layer. In particular, with the MNIST dataset, they report an accuracy of 98.96% and we get an essentially equivalent accuracy of 98.98%. In both our work and Zhou and Feng's work, these decision tree networks can be trained efficiently without the use of backpropagation. Each layer remains static once trained, and so the training data can be pushed through to train the next layer. Hence, the training time for a multi-layer forest should be much faster than the training time for a traditional DNN architecture. We note in a companion paper [ft_nets] that we can also train a DNN in a similar fashion, without the use of backpropagation, thus also speeding up the training process. The results of both gcForest and our study are convincing, and we believe that both papers confirm the validity of exploring deep architectures with decision trees and random forests.

In Section 2, we give a careful description of the general architecture for a forward thinking deep network. In Section 3, we describe how the general theory is applied in the specific case of a Forward Thinking Deep Random Forest (FTDRF). Details relating to the processing of data and the experimental results are in the subsequent sections.

2 Mathematical description of forward thinking

The main idea of forward thinking is that neurons can be generalized to any type of learner and then, once a layer of such learners is trained, the input data are mapped forward through the layer to create a new learning problem. The process is then repeated, transforming the data through multiple layers, one at a time, rendering a new dataset, which is expected to be better behaved, and on which a final output layer can achieve good performance.

The input layer

The data are given as a set of input values $\{x_j\}_{j=1}^N$ from a set $D_0$ and their corresponding outputs $\{y_j\}_{j=1}^N$ in a set $Y$. In many learning problems, $D_0 = \mathbb{R}^d$, which means that there are $d$ real-valued features. If the inputs are images, we can stack them as large vectors where each pixel is a component. In some deep learning problems, each input is a stack of images. For example, color images can be represented as three separate monochromatic images, or three separate channels of the image. For binary classification problems, the output space can be taken to be $Y = \{0, 1\}$. For multi-class problems with $c$ classes we often set $Y = \{1, 2, \ldots, c\}$.

The first hidden layer

Let $F_1 = \{f_1, \ldots, f_{n_1}\}$ be a set of learning functions, $f_i : D_0 \to C$, for some codomain $C$, with parameters $\theta_i$. This layer of learning functions (or learners) can be regression, classification, or kernel functions and can be thought of as defining new features. Let $D_1 = C^{n_1}$ and transform the inputs $x_j \in D_0$ to $x_j^{(1)} \in D_1$ according to the map

$$ x \mapsto x^{(1)} = \big( f_1(x; \theta_1),\, f_2(x; \theta_2),\, \ldots,\, f_{n_1}(x; \theta_{n_1}) \big). $$

This gives a new dataset $\{(x_j^{(1)}, y_j)\}_{j=1}^N$. In many learning problems, $C = [0, 1]$, in which case the new domain $D_1 = [0,1]^{n_1}$ is a hypercube. It is also common for $C = [0, \infty)$, in which case $D_1$ is the $n_1$-dimensional positive orthant. The goal is to choose the parameters $\theta_1, \ldots, \theta_{n_1}$ to make the new dataset "more separable," or better behaved, than the previous dataset. As we repeat this process iteratively, the data should become increasingly better behaved so that in the final layer, a single learner can finish the job.

Additional hidden layers

Let $F_k = \{f_1^{(k)}, \ldots, f_{n_k}^{(k)}\}$ be a set (layer) of learning functions $f_i^{(k)} : D_{k-1} \to C$. This layer is again trained on the data $\{(x_j^{(k-1)}, y_j)\}_{j=1}^N$. This would usually be done in the same manner as the previous layer, but it need not be the same; for example, if the new layer consists of different kinds of learners, then the training method for the new layer might also need to differ. As with the first layer, the inputs are transformed to a new domain $D_k = C^{n_k}$ according to the map

$$ x^{(k-1)} \mapsto x^{(k)} = \big( f_1^{(k)}(x^{(k-1)}),\, \ldots,\, f_{n_k}^{(k)}(x^{(k-1)}) \big). $$

This gives a new dataset $\{(x_j^{(k)}, y_j)\}_{j=1}^N$, and the process is repeated.

Final layer

After passing the data through the last hidden layer $F_L$, we train the final layer, which consists of a single learning function $g : D_L \to Y$, on the dataset $\{(x_j^{(L)}, y_j)\}_{j=1}^N$ to determine the outputs $\hat{y}_j = g(x_j^{(L)})$, where $\hat{y}_j$ is expected to be close to $y_j$ for each $j$.
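
To make the framework concrete, the following minimal Python sketch trains a forward thinking network layer by layer. The choice of scikit-learn decision trees as the learning functions and logistic regression as the final learner $g$ is purely illustrative; any learners fitting the description above could be substituted.

# A minimal sketch of forward thinking with generic learners. The learner types,
# layer sizes, and number of layers are illustrative assumptions only.
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LogisticRegression

def fit_forward_thinking(X, y, n_layers=3, n_learners=50):
    layers = []
    for _ in range(n_layers):
        # Train one layer of learning functions on the current representation.
        layer = [DecisionTreeClassifier(max_features="sqrt").fit(X, y)
                 for _ in range(n_learners)]
        layers.append(layer)
        # Map the data forward: each learner contributes its outputs as new features,
        # so the new domain is a hypercube of class probabilities.
        X = np.hstack([f.predict_proba(X) for f in layer])
    # Final layer: a single learner g trained on the last representation.
    g = LogisticRegression(max_iter=1000).fit(X, y)
    return layers, g

def predict_forward_thinking(layers, g, X):
    for layer in layers:
        X = np.hstack([f.predict_proba(X) for f in layer])
    return g.predict(X)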

Remark 1

While in this paper we have applied the multi-layer architecture of neural networks to decision trees in random forests, we note that this can be generalized to other types of classifiers. Where the decision trees in our architecture are analogous to the neurons in a DNN, other classifiers such as SVMs, gradient boosted trees, etc., could be substituted for neurons in a similar fashion.

3 Forward thinking deep random forest architecture

In this section we describe the method of construction for layers of the Forward Thinking Deep Random Forest (FTDRF) architecture. We note the similarities to the routine termed cascade forest in [ZhouF17] and address these in this section.

3.1 Multilayer random forests

Using the notation of the previous section, we have training data (inputs and labels) $\{(x_j, y_j)\}_{j=1}^N$, where the $x_j$ are feature vectors and the $y_j$ are the corresponding labels. An FTDRF consists of multiple layers of classifiers, where each layer consists of a forest comprised of a blend of random and extra random trees. The output of each individual tree is a vector of class probabilities, as determined by the distribution of classes present in the leaf node into which the sample is sorted. Specifically, given any decision tree, each leaf of the tree is assigned a vector of class probabilities $(p_1, \ldots, p_c)$, corresponding to the proportion of training data assigned by the tree to that leaf in each class.

Each layer $F_k$ is trained on the data $\{(x_j^{(k-1)}, y_j)\}_{j=1}^N$, and $x_j^{(k)}$ is the result of pushing the input $x_j^{(k-1)}$ through that layer. Specifically, for each input, the output of tree $i$ in layer $k$ is a probability vector, and these are concatenated together at each layer, so that for each input the output of layer $k$ is an $n_k$-tuple of probability vectors, where $n_k$ is the number of trees in layer $k$. This is done for all of the training data, hence transforming the data to be of dimension $c \cdot n_k$, where $c$ is the number of classes for the training dataset and $n_k$ is the number of trees in the current layer. The outputs of each layer become the inputs to the next, until the data have been mapped through the final layer $F_L$. The final class prediction is made by averaging all the class probability output vectors from the decision trees in $F_L$ and predicting the class with the highest probability. One could, of course, use any classifier to find an optimal combination of the weights for the final layer, but we do not explore this possibility in this paper.
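
The following minimal sketch shows this forward mapping and the final prediction rule in Python, assuming each tree is a fitted scikit-learn classifier exposing predict_proba; it is an illustration of the construction rather than our exact implementation.

# A minimal sketch of the FTDRF forward mapping and final prediction rule.
import numpy as np

def transform_layer(layer, X):
    # Concatenate every tree's class-probability vectors: output shape is (N, c * n_k).
    return np.hstack([tree.predict_proba(X) for tree in layer])

def predict_ftdrf(layers, X):
    # Push X forward through all but the last layer, then average the final
    # layer's tree probabilities and predict the class with the highest probability.
    for layer in layers[:-1]:
        X = transform_layer(layer, X)
    probs = np.mean([tree.predict_proba(X) for tree in layers[-1]], axis=0)
    return layers[-1][0].classes_[np.argmax(probs, axis=1)]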

Figure 1: Forward Thinking Deep Random Forest (FTDRF)

3.2 gcForest comparison

Unlike Zhou and Feng's architecture for gcForest, our deep architecture of decision trees only requires the previous layer's output. In [ZhouF17], each layer passes both the class probabilities predicted by the random forests (not the individual decision trees) and the original data to each subsequent layer. Our model, on the other hand, passes only the output of the previous layer of individual decision trees to the next layer, to reduce the spatial complexity of network training and testing. Moreover, FTDRF seems to need fewer trees in each layer. For example, in our FTDRF described in Section 5, we obtained results comparable to [ZhouF17] on MNIST while using half as many trees, whereas [ZhouF17] uses 4 random forests per layer (see Tables 1 and 2 for the tree counts). Another distinction is that our final routine uses information gain (entropy) to calculate node splits, whereas gcForest implements Gini impurity. We also ran some tests with Gini impurity to determine node splits, but found that entropy usually performed better.
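
The difference between the two split criteria is easy to probe empirically. The snippet below, which uses scikit-learn's small digits dataset rather than MNIST and an arbitrary forest size, is only an illustrative comparison, not the experiment reported in this paper.

# Illustrative comparison of entropy vs. Gini node splits on a small dataset.
# Dataset, forest size, and split are placeholders, not the paper's experiment.
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.2, random_state=0)
for criterion in ("entropy", "gini"):
    rf = RandomForestClassifier(n_estimators=200, criterion=criterion,
                                random_state=0).fit(X_tr, y_tr)
    print(criterion, rf.score(X_val, y_val))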

3.3 Half-half random forest layers

As is standard in random forests, a node split in a given decision tree is determined from a random subset containing $\sqrt{d}$ of the $d$ features of the input data passed to the layer. In a given layer, the collection of decision trees representing the layer contains both random decision trees as well as extra random trees, to introduce more variety into the layer. This is similar to the layers of [ZhouF17], where half of the random forests in a given layer are completely random forests [Liu_2008], closely related to extra random forests. An extra random forest increases tree randomization by choosing a random splitting value for each of the features in the subset used to determine the node split. In our scheme, we randomly assign trees to be of this type based on a Bernoulli draw with probability 1/2.
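
The following minimal sketch shows one way to build such a half-half layer with scikit-learn; the tree count, random seed, and hyperparameters are illustrative placeholders rather than the settings used in our experiments.

# A minimal sketch of half-half layer construction: each tree is assigned to be
# an extra random tree with probability 1/2. Settings shown are illustrative.
import numpy as np
from sklearn.tree import DecisionTreeClassifier, ExtraTreeClassifier

def fit_half_half_layer(X, y, n_trees=100, seed=0):
    rng = np.random.default_rng(seed)
    layer = []
    for _ in range(n_trees):
        # A Bernoulli(1/2) draw decides between a random tree and an extra random tree.
        Tree = ExtraTreeClassifier if rng.random() < 0.5 else DecisionTreeClassifier
        layer.append(Tree(max_features="sqrt",    # split on a random sqrt(d)-sized feature subset
                          criterion="entropy",    # information-gain splits (Section 3.2)
                          random_state=int(rng.integers(1 << 31))).fit(X, y))
    return layer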

3.4 Adding layers

An advantage of forward thinking is that the total number of layers is determined by the data, rather than by a human designer. In the case of FTDRF, the choice of whether to create a new layer or terminate is determined by a cross validation scheme. After each layer is constructed, we evaluate the accuracy on a holdout sample taken from the training data to determine the relative gain produced by the last added layer. If the layer meaningfully increased the validation accuracy (i.e., the relative gain is above a chosen threshold), then we proceed and add another layer to the FTDRF. Once the relative gain of a new layer falls below the threshold, we stop adding new layers and obtain predictions via the final layer of our network. For our network, we fixed a small relative gain threshold in advance. Some results are shown in Table 1, below. We note that the trees in the layers of our specific implementation here were not created using boosting (e.g., XGBoost [ChenG16]), but we expect that doing so could be beneficial and possibly lead to increased accuracy.
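
The following sketch illustrates this stopping rule, reusing the fit_half_half_layer and transform_layer sketches from earlier in Section 3; the holdout fraction, relative-gain threshold, and layer cap shown here are placeholders rather than the values used in our experiments.

# A minimal sketch of adaptive layer growth with a relative-gain stopping rule.
import numpy as np
from sklearn.model_selection import train_test_split

def layer_accuracy(layer, X, y):
    # Use the layer as a forest: average the tree probabilities, predict the argmax class.
    probs = np.mean([tree.predict_proba(X) for tree in layer], axis=0)
    preds = layer[0].classes_[np.argmax(probs, axis=1)]
    return np.mean(preds == y)

def grow_ftdrf(X, y, n_trees=100, threshold=0.001, max_layers=20):
    X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.2, random_state=0)
    layers, prev_acc = [], None
    for k in range(max_layers):
        layer = fit_half_half_layer(X_tr, y_tr, n_trees=n_trees, seed=k)
        acc = layer_accuracy(layer, X_val, y_val)
        # Stop once the new layer's relative gain over the previous one falls below threshold.
        if prev_acc is not None and (acc - prev_acc) / prev_acc < threshold:
            break
        layers.append(layer)
        # Map both splits forward so the next layer trains on the new representation.
        X_tr, X_val = transform_layer(layer, X_tr), transform_layer(layer, X_val)
        prev_acc = acc
    return layers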

4 Preprocessing of image data

The decision tree structure of FTDRF requires sufficient training data to avoid overfitting in the first few layers. The state-of-the-art algorithms for dealing with image data in classification use preprocessing and transforming techniques, such as convolutions. Accordingly, we experimented with two of these techniques for FTDRF: a single-pixel "wiggle" and multi-grained scanning (MGS).

4.1 Single-pixel wiggle

For the dataset used here, we augmented the training data by a single-pixel "wiggle" technique. That is, for each training image in the MNIST training set, we include four copies of the image shifted diagonally (up-left, up-right, down-left, and down-right) by one pixel; see Figure 2. This data augmentation yields the results seen in Table 1. A further way to augment the feature representation of the images is presented in Section 4.2, via a routine called Multi-Grained Scanning [ZhouF17].
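
A minimal sketch of this augmentation is shown below; the zero-filling of the rolled-in border is an assumption about edge handling rather than a detail specified above.

# A minimal sketch of the single-pixel wiggle: each 28x28 image gains four
# diagonally shifted copies. Border handling (zero fill) is an assumption.
import numpy as np

def wiggle(images):
    """images: array of shape (N, 28, 28). Returns the originals plus 4 shifted copies."""
    shifts = [(-1, -1), (-1, 1), (1, -1), (1, 1)]  # up-left, up-right, down-left, down-right
    augmented = [images]
    for dy, dx in shifts:
        shifted = np.roll(images, shift=(dy, dx), axis=(1, 2))
        # Zero out the rows/columns that wrapped around during the roll.
        shifted[:, 0 if dy > 0 else -1, :] = 0
        shifted[:, :, 0 if dx > 0 else -1] = 0
        augmented.append(shifted)
    return np.concatenate(augmented, axis=0)

# Example usage: X_aug = wiggle(train_images); y_aug = np.tile(train_labels, 5)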

Figure 2: Single-pixel wiggle visualization

4.2 Multi-grained scanning (MGS)

In [ZhouF17], a scheme similar to convolution is proposed, termed Multi-Grained Scanning (MGS), which we implemented for the FTDRF architecture. We use the exact same process that Zhou and Feng do in their MGS scheme [ZhouF17] so as to be able to compare the results of our architecture in the subsequent network structure, FTDRF. We view this MGS process as a preprocessing transformation akin to the convolutions of convolutional neural networks (CNNs), with the benefits and strengths that such transformations provide.

Figure 3: Multi-grained scanning (MGS) routine, window size = 14

In MGS, windows of a set number of sizes are obtained inside the training set images (for the MNIST dataset the window sizes are 7x7, 9x9, and 14x14). For a given window size, the corresponding windows contained inside all training images are used as a training set to construct a random forest and an extra random forest whose outputs are the class probabilities. Unlike our routine for building the FTDRF layers, which outputs the class probabilities determined by each individual decision tree in the layer, this scheme outputs the class probabilities determined by the whole random forest. Hence, for a given window size, the output of the random forest for each image window is a vector of the class probabilities. For all samples fed through these random forests, the outputs of all image windows are concatenated together to produce a feature vector representing the classification probabilities of each of the windows (see Figure 3). With the window sizes specified, the outputs of each of the random forests for the respective window sizes are all concatenated together. This feature vector is the new representation of each given sample fed through the MGS process. With this transformation of the training data (and subsequently the testing data), we train the FTDRF layers as previously described in Section 3.
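
The sketch below illustrates the idea, assuming square images and scikit-learn forests; the stride, forest sizes, and window handling are illustrative simplifications of Zhou and Feng's routine rather than our exact implementation.

# A minimal sketch of MGS: per window size, fit one random forest and one
# extra-trees forest on all windows, then concatenate their class probabilities.
import numpy as np
from sklearn.ensemble import RandomForestClassifier, ExtraTreesClassifier

def extract_windows(images, size, stride=1):
    """Slide a size x size window over each image; returns shape (N, n_windows, size*size)."""
    N, H, W = images.shape
    wins = [images[:, r:r + size, c:c + size].reshape(N, -1)
            for r in range(0, H - size + 1, stride)
            for c in range(0, W - size + 1, stride)]
    return np.stack(wins, axis=1)

def mgs_features(images, labels, sizes=(7, 9, 14), n_trees=30):
    features = []
    for size in sizes:
        wins = extract_windows(images, size)           # (N, n_windows, size*size)
        N, n_win, d = wins.shape
        flat = wins.reshape(N * n_win, d)
        flat_labels = np.repeat(labels, n_win)          # each window inherits its image's label
        for Forest in (RandomForestClassifier, ExtraTreesClassifier):
            forest = Forest(n_estimators=n_trees, n_jobs=-1).fit(flat, flat_labels)
            probs = forest.predict_proba(flat).reshape(N, -1)  # (N, n_windows * c)
            features.append(probs)
    return np.hstack(features)

In practice, a larger stride or a subsample of the windows may be needed to keep the number of window samples and the resulting feature vectors manageable in memory.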

5 FTDRF results on MNIST

We present results for an FTDRF on the MNIST handwritten digit recognition dataset, where each sample is a 28x28 black and white image of an isolated digit, written by different people. The dataset is split into a training set with 60,000 (see note below) samples and a testing set of 10,000 samples.

5.1 Results with single-pixel wiggle

For each training image, we created four more images via the single-pixel wiggling technique to augment the size of the training data. The layers of FTDRF contained a blend of random decision trees and extra random trees, as described in Section 3.3 (see Table 1 for the tree count). Layers were grown until the relative gain fell below the chosen threshold. Node splits were determined by calculating information gain (entropy). We cite our results and the results of Zhou and Feng [ZhouF17] to compare, as their architecture is most relevant to ours. We note, however, that Zhou and Feng do not augment data in this test. The results are:

Model #Trees Accuracy
gcForest 4000 97.85%
FTDRF 2000 97.58%
Table 1: MNIST results without MGS

5.2 Results with MGS

Table 2 presents the results of our architecture compared to Zhou and Feng's gcForest [ZhouF17], including the MGS preprocessing routine. Note that we do not augment the dataset with the single-pixel augmentation as we did previously. In this test, window sizes of 7, 9, and 14 were used for the MGS step, creating a total of six random forests (three random forests and three extra random forests) to transform the data for the FTDRF training. Then, the training data was passed through to the FTDRF step, where the layers again consisted of a blend of random decision trees and extra random trees; in this step, only a few layers were necessary to reach the relative validation gain threshold. The results are:

Model #Trees Accuracy
gcForest 4000 98.96%
FTDRF 2000 98.98%
FTDRF 500 98.89%
Table 2: MNIST results with MGS

6 Related work

As we have mentioned, the work of Zhou and Feng [ZhouF17] is similar to our work, and their preprocessing technique of MGS was adapted for our use in testing. Our FTDRF primarily differs from Zhou and Feng's gcForest in that gcForest passes the outputs of whole random forests, concatenated onto the original data, to each subsequent layer, whereas we pass only the outputs of the individual trees to subsequent layers. The gcForest algorithm was very successful in a variety of classification settings, including image and sequential data (in which MGS is applied), along with other non-sequential data (in which MGS is not applied).

Another related idea is that of "stacking" classifiers [Wolpert92, Zhou:2012]. In the context of [Wolpert92], an ensemble of classifiers is trained and then further improvements are made by adding a classifier or ensemble of classifiers to interpret the best way to combine the outputs of the original ensemble's classifiers. From the perspective of our deep architecture and its building process, the idea of stacking could therefore be compared to a two-layer architecture, with a relatively small second layer. Our idea proposes to continue the process, with larger layers stacked similarly to a DNN.

The connections between random forest and DNN structure were explored in [biau:neuralRF, deep-neural-decision-forests, wolf:deep_neural_pre]. These papers assert that random forest construction bears similarity to DNN construction and that random forests can thus be transformed into neural networks and vice versa. More specifically, the mathematical dependencies between DNN nodes have been shown to be similar to the dependencies between decision tree leaf nodes in random forests. While these methods draw connections between the construction of decision trees in random forests and DNNs, our work is fundamentally different in the idea of ensembling decision trees together in layers resembling a DNN architecture. As explained in Section 3, our architecture represents mapping data through different hypercubes in hopes of iterating towards more easily classified data.

Reproducibility

All Python code used to produce our results is available in our GitHub repository at https://github.com/tkchris93/ForwardThinking.

Acknowledgments

This work was supported in part by the National Science Foundation, Grant Number 1323785, and the Defense Threat Reduction Agency, Grant Number HDRTA1-15-0049.

References

[1] Gerard Biau, Erwan Scornet, and Johannes Welbl. Neural random forests.
[2] Leo Breiman. Random forests. Machine Learning, 45:5–32, 2001.
[3] Tianqi Chen and Carlos Guestrin. XGBoost: A scalable tree boosting system. CoRR, abs/1603.02754, 2016.
[4] Pierre Geurts, Damien Ernst, and Louis Wehenkel. Extremely randomized trees. Machine Learning, 63(1):3–42, 2006.
[5] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. CoRR, abs/1512.03385, 2015.
[6] Chris Hettinger, Tanner Christensen, Ben Ehlert, Jeffrey Humpherys, Tyler Jarvis, and Sean Wade. Forward thinking: Building and training neural networks one layer at a time. 2017. Preprint.
[7] Torsten Hothorn, Kurt Hornik, and Achim Zeileis. Unbiased recursive partitioning: A conditional inference framework. Journal of Computational and Graphical Statistics, 15(3):651–674, 2006.
[8] Peter Kontschieder, Madalina Fiterau, Antonio Criminisi, and Samuel Rota Bulò. Deep neural decision forests. June 2016.
[9] Alex Krizhevsky, Ilya Sutskever, and Geoffrey E. Hinton. ImageNet classification with deep convolutional neural networks. In F. Pereira, C. J. C. Burges, L. Bottou, and K. Q. Weinberger, editors, Advances in Neural Information Processing Systems 25, pages 1097–1105. Curran Associates, Inc., 2012.
[10] Yann LeCun, Léon Bottou, Genevieve B. Orr, and Klaus-Robert Muller. Efficient backprop. In Neural Networks: Tricks of the Trade, This Book is an Outgrowth of a 1996 NIPS Workshop, pages 9–50, London, UK, 1998. Springer-Verlag.
[11] Yann LeCun, Léon Bottou, Yoshua Bengio, and Patrick Haffner. Gradient-based learning applied to document recognition. In Proceedings of the IEEE, pages 2278–2324, 1998.
[12] Yann LeCun, Corinna Cortes, and Christopher J. C. Burges. The MNIST database of handwritten digits. http://yann.lecun.com/exdb/mnist/. Accessed: 2017-05-19.
[13] Fei Tony Liu, Kai Ming Ting, Yang Yu, and Zhi-Hua Zhou. Spectrum of variable-random trees. J. Artif. Int. Res., 32(1):355–384, May 2008.
[14] Kevin P. Murphy. Machine Learning: A Probabilistic Perspective. The MIT Press, 2012.
[15] F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, and E. Duchesnay. Scikit-learn: Machine learning in Python. Journal of Machine Learning Research, 12:2825–2830, 2011.
[16] Richard Socher, Alex Perelygin, Jean Wu, Jason Chuang, Christopher D. Manning, Andrew Ng, and Christopher Potts. Recursive deep models for semantic compositionality over a sentiment treebank. In Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, pages 1631–1642, Seattle, Washington, USA, October 2013. Association for Computational Linguistics.
[17] Johannes Stallkamp, Marc Schlipsing, Jan Salmen, and Christian Igel. Man vs. computer: Benchmarking machine learning algorithms for traffic sign recognition. Neural Networks, 32:323–332, 2012.
[18] Alexander Statnikov, Lily Wang, and Constantin F. Aliferis. A comprehensive comparison of random forests and support vector machines for microarray-based cancer classification. BMC Bioinformatics, 9(1):319, 2008.
[19] Christian Wolf. Random forests v. deep learning. 2016.
[20] David H. Wolpert. Stacked generalization. Neural Networks, 5:241–259, 1992.
[21] Zhi-Hua Zhou. Ensemble Methods: Foundations and Algorithms. Chapman and Hall/CRC, 1st edition, 2012.
[22] Zhi-Hua Zhou and Ji Feng. Deep forest: Towards an alternative to deep neural networks. CoRR, abs/1702.08835, 2017.