...
Database
Query Profiling
15 min read
Usingexplainandexplainanalyzeisoneofthewaysofqueryprofiling.Usingthesekeywordsinanyquerygivesyouasummationofalgorithmsusedtoexecutethatparticularquery.Whybother?Becausewhenthetablesgrowinsize,someofthequeriesstarttakingalotoftimeandthislatencygetsaddedupintheAPIresponsetimewhichisevidentontheUI,totheuser.
Optimizationscanbeondifferentlevels.Youcanuseacachingdbtoavoidqueryingyourmasterdatabase,youcannormalizeyourtabletodividethefieldsintoatomiclevel,youcanuseclientsidecaching,youcanavoidrepeatedcallstothedbinthemiddlewaresandcontrollers,youcanimprovethequeriesbyavoidingfetchingthecolumnsthatarenotneededoravoidingsubqueriesetc.ButrightnowIwishtodiscusstheimpactofindexesonqueryoptimizations.
Let’ssayyouhavedonetheseaforementionedthingsandstillwanttoreducethelatencyfurther.Creatinganindextablecanbehelpfulbutlikeeverything,excessindexingisalsoharmfulbecauseinordertoreducethetimecomplexityyoumightincreasethespacecomplexity.Now,itdependsontheindividualcaseoraprojectorthebudget.Butlet’sdrawalinethatwedon'twanttogooverboardwithindexingbutuseitdefinitely.Butfirst,let’sdefinetheplayground.
IthasthesecolumnsandIhavepopulateditwitharound7millionrows.Thetable’ssizeis1.6GBs.
Apartfromthetable’ssize,whatseparatesitfromtherealworldscenarioisitsautoincrementidasaprimarykey,thatfactthati’msavingtheaddressinthesametableasevents,andtherecanbeafewmoretopointoutwhichareescapingmeatthemomentbutthatisnotimportantfornow.It’snotarealworldtablebutemulatingtherealworldtablewillnotcontradictthesepointsbutonlyamplifythem.Startingwithasimplequerywhichfetchesalltheeventshappeninginthecurrentmonth.
It’susingEXPLAINANALYZEwhichwillrunthequeryandgiveusbehindthescenesaswell.It’sfetchingallthecolumnsandhasasimplewhereclausewhichischeckingonlyonecolumnwhichis“date”.
ThisistheoutputofaPostgreSQLEXPLAINANALYZE.Afterabriefglance,wecantellthatithasassignedtwoworkerthreads,it’sperformingaparallelsequentialscanontherowsandit’sfilteringtherowsaccordingtotheconditionthatwehadspecifiedandfinallygatheringthedatafromtheworkerstotellusthat162ksuchrowsexistandtheactualtimeittook.Forlargertables,parallelsequentialscanisusedwhereeachworkerscansthetablesequentiallybuttheseworkersareworkinginparallel.It’snotsuitableforsmallertablesandthisalgorithmcanincreasethelatencyforthem.Theloopsays3whichmeanstwoworkerthreadsandonemainthread.
Eachreturning54096rowswhichifmultipliedby3givesaround162krows.Filterisprettystraightforward,it’sthefilteringconditionbeingusedbytheseworkers.Italsotellshowmanyrowsithasdiscardedduetotheconditionnotmatchingwiththoserows.
Eachreturning54096rowswhichifmultipliedby3givesaround162krows.Filterisprettystraightforward,it’sthefilteringconditionbeingusedbytheseworkers.Italsotellshowmanyrowsithasdiscardedduetotheconditionnotmatchingwiththoserows.
Now,youcansaythatifthisfilterhasremovedthesemanyrowsthenthequeryshouldreturntotalrowsminusthisnumberofrows.Butthat’snotthecase.Theactualrowsreturnedbythequeryisaround162k.Thisfigureofrowsremovedbythefilterdoesnotrepresentthetotalrowsremovedbythefilterfromtheentiretable.Thisistheaveragenumberofrowsremovedbyeachlooporeachworker.Soifitsaysthatithasused3loopsandyoumultiplythisnumberby3togetthetotalrowsremovedandsubtractitfromthetotalrowsofthetable,youwillgetapproximately162krows,whichistheresult.
It’saskingforthesamethingbutinadifferentway.It’susingtheinbuiltdatefunctions.
Thequeryhastakensignificantlymoretimethanbefore.Otherthanthat,thenumberofrowsfilteredoutandreturnedbyeachworkerisslightlydifferentduetothefactthatthesefunctionsmaketheworkersscaneachrowwithoutusingb-treeindexandapplythistransformonitsvalue.Theseinbuiltdatefunctionstransformandmodifythecolumninordertoexecutethequerywhichincreasestheoverheadinexecutiontime.Ifyoukeepincreasingthetable’ssizeandincludeconcurrentquerieshittingthisdatabase,youcanimaginetheresponsetimecanincreasefurther.
Nowletsrunthesetwoqueriesagainonebyonestartingwiththefirstone,theonewithoutthedatefunctions.
Also,ithasusedadifferentscanthistime-parallelbitmapheapscanonthetableandbitmapindexscanontheindexandanadditionalrecheckoftheconditionisaddedinthestepsaswell.WithoutlookingatthisresultIwouldhaveguessedthatitwouldhavetakenlesstimethanbeforesinceitwilluseasequentialorparallelscanontheindexwejustcreated.Duetothehighnumberofexpectedrows,indexonlyscanisnotpreferredasit’sconsideredinefficientinthisscenario.Itperformsabitmapscanontheindexfirstwhichgivestheprobablematchesoftherowsandusingthedatafromthisscanitperformsthebitmapscanontheactualheaptofetchalltherows.
Noticethatthebitmapindexscanisshowingapprox162krows.Now,thepartwhereitsaysheapblocksandindexrecheckisthestepwhereitittellingusthatoutofalltheheapblocksofthetable,itissureabout14701blocksandtheirrowsbutthelossyonesaretheonesforwhichithadtorechecktheirrowsagainstthefilterconditionandduringwhichitdiscarded761546rows.Thispartaddssomeoverhead.Nowlet’srunthesecondqueryagain,theonewiththedatefunctions.
Therestofthedetailsarealsothesameasbeforeexceptthefactthatittookalmostthricetheamountofexecutiontime.Thereasonforthisprobablywastemporarycachechangesandslightlyhigherplanningoverheadduetothecreationoftheindex.WhenIranthisqueryagain,theexecutiontimealsoreducedto1250msliketheinitialscenario.Sooutofall,theonewhichisusingtheindexandusingsimplecomparatorswithdatevaluesseemstobethemostappropriatequeryanditcanfurtherbeoptimizedbyfetchingonlytherequiredcolumninsteadofallthecolumnswhichreducedtheexecutiontimefurtherbyapproximately400msinmycase.
Thisnewindexis258MBsinsizewhichissignificantlymorethantheprevioussinglecolumnindex.Nowirunthesamequeryagainwhichwasfilteringtherowsbasedonstartandenddaterangeusinglogicalcomparators.
Thisisduetotheorderofcolumnsspecifiedwhilecreatingtheindex.So,creatingandusingtheindexhastobeajudiciousdecision.Whenyouhavemorethanonequeriesrunningondifferenttables,youmightwanttocreateindexofeverypossiblecombinationofcolumnsbeingusedbutthisapproachfirstofallisnotscalableandsecondly,indextablesalsotakeupspaceandwhenyourunupdateoralterqueriesthosechangesareperformedontheseindextablesaswell.Thataddstheoverheadintheupdatequeries.So,thefirststepistojustunderstandwhat’sgoingoninthequeriesandthatunderstandingwillpresentyouwiththerightsolution.