DOC

Optimal

By Rodney Lawson,2014-07-24 00:52
7 views 0
OptimalOptima

    Optimal

2O卷第4期重庆邮电大学学报(自然科学版)Vo1.2ONo.4

    ;20088

    JournalofChongqingUniversityofPostsandTelecommunications(NaturalScienceEdition

    )Aug.2008

    ;Optimalalgorithmsforschedulinglarge

    ;ll??l

    ;scaleapplicationonheterogeneoussystems ;WANGQingxian

    ;(SchoolofComputerScienceandEngineering,UniversityofElectronicScienceandTechn

    ologyofChina,Chengdu610054,P.R.China)

    ;Abstract:ThispaperstudiesoptimalalgorithmsforschedulinglargescaleapplicationonheterogeneoussystemsusingDivis

    ;ibleLoadTheoU.A.1orerealisticandgeneralmodel,i.e.,bothprocessorsandcommunicati

    onlinksmayhavedifferent

    ;speedsandarbitraUstartupcosts,andcommunicationisinnonblocking.1ode,isintroduced.Undersuchenvironment,

    ;thefollowingresultsareobtained:Mathematicmodelandclosedformexpressionsbothfortheprocessingtimeandthe

    ;fractionofloadforeachprocessorarederived;?

    theinfluenceofstartupcostsontheoptimalprocessingtimeisanalyzed; ;?

    foragivenheterogeneoussystemsandalargescalecomputingproblem,optimalalgorithmsareproposed.

    ;Keywords:heterogeneouscomputing;divisibleloadtheory;nonblockingmodeofcommunication;startupcosts;schedu

    ;lingalgorithm.

    ;CLCnumber:TP301.6Documentcode:AArticalID:1673-825X(2008)04-0440-07

    ;1Introduction

    ;Interestinsolvinglargescalecomputationintensive

    ;problemsinanetworkbaseddistributedcomputingen

    ;vironmenthasreceivedmuchmoreattentionrecently. ;Thisincludestheuseofclustersofworkstations,net

    ;worksofworkstationsoragridcomputingenvironment. ;Amajorproblemofsuchapplicationsistofindanopti

    ;malalgorithmforschedulingworkloadsothatthepro

    ;cessingtimeisminimized[1].

    ;Amaincharacterofsuchplatformisheterogeneity. ;However,theproblemofoptimalschedulingoftasks ;onheterogeneoussystem,inthemostgeneralcase,has

;beenproventobeNPhardandanoptimalsolutioncan

    ;befoundonlyafteranexhaustivesearch[1].Inshaqa ;contrastwithgeneraltaskscheduling,undersomecon

    ;ditions,optimalalgorithmsandclosedformformulas

    ;existforschedulingdivisibleload.

    ;2Themodelandrelatedwork

    ;InthisSection,themathematicmodel,alltheter

    ;Receiveddate:2008-0401

    ;Foundationitem:ThisresearchwassupportedbytheNational ;NaturalScienceFoundationofChinaunderGrant(10467007) ;Foundationof”863”(2006AAO1Z414)

    ;minology,definitions,andnotationsthatareused ;throughoutthepaper.Somerelatedworksarealso ;presented.

    ;2.1Problemstated

    ;Inthispaper,wewillstudythefollowingproblem: ;givenalarge?-scaleapplicationandacomputingplat?- ;formofnetworkconnectedprocessors,ourobjectiveis ;tofindataskschedulingalgorithmtodividethework

    ;loadoftheapplicationintoseveralsubtasksandthen

    taskstoexecuteoneachprocessor ;schedulethesesub

    ;inparallel,suchthatthemankspan,i.e.,thecom

    ;pletiontimeoftheapplication,isminimum. ;Letlbetheworkload,andn,thenumberofa

    ;vailableprocessors,assumeo/denotesthefractionof ;theloadwillbeassignedtoprocessorP,Tibethe ;completedtimeofP,andTbethemakespan,then ;theaboveproblemcanbeformattedas

    ;T=min{max

    ;…l2..{Ti)(1)

    ;suchthat

    ;?d=vt_jl’d>0,1?i?n(2)

    ;2.2Themodel

    ;Theprobleminthispaperismodelasdivisibleload. ;Thetargetplatformisaheterogeneoussystem.Thehet

    ;

    ;4

    WANGQing—xian.Optimalalgorithmsforscheduli”glal”gescaleapplicationonheterogeneoussystems?441?

    ;erogeneousnetworksweconsideredareofdifferent ;computingspeedsandcommunicationspeeds. ;2.3Relatedwork

    ;Duetoitssimpleyetrealisticandanalyticaltracta

    ;bility,thedivisibleloadtheoryhasbeenwidelystudied

    ;inthelastseveralyears.Alistofpapersinthisareais ;availableat[2].Compilationofalltheresearchcon

    ;tributionsinDLTuntil1996canbefoundinamono

    ;graphs[3]writtenbyBharadwaj,GhoseandRober

    ;tazzi,andarecentreportsummarizesallthepublished ;contributionstilldateinthisdomain[1,4].Aunified ;theoreticalperspectivethatsynthesizespreviouslypub

    ;lishedresults,severalnovelresults,andopenques

    ;tionsarepresentedin[5].Therefore,weshallnow ;presentaverybriefsurveyonsomeofsignificantcon

    ;tributionsinthisarearelevanttotheproblemaddressed ;inthispaper.

    ;Inalltheresearchsofarinthisdomain.aprinciple ;thatisusedtoderiveoptimalsolutionisasfollows:in ;ordertoobtainanoptimalprocessingtime,

    ;itisneces

    ;saryandsufficientthatalltheprocessorsparticipating ;inthecomputationmuststopcomputingatthesame ;timeinstantly.Althoughtheoptimalityprinciplere

    ;mainsvalidforevenanarbitrarynetworktopology, ;the

    ;optimaltimeperformancedependscruciallyonthese

    ;lectionofapropersubsetoftheavailableprocessors. ;Thus,usingalargersetofnodesmayyieldaninferior ;performancecomparedtoanoptimalsubsetofnodesa

    ;mongwhichtheloadisdistributedaccordingtothe0p

    ;timalityprinciple[3,5].

    ;ThereisawideacceptanceinDLTresearchona

    ;doptingalinearcostmodeloftheprocessorspeedand ;communicationlinkspeedparameters[1,3,4,6].In ;thismodel,thecommunicationtimedelayisassumed ;tobeproportionaltotheamountofloadthatistrans

    ;ferredoverthechannel,andthecomputationtimeis ;proportionaltotheamountofloadassignedtothepro

    ;cessor.

    ;Inthecaseofos=0andgg,whichrepresent

    ;ahomogeneousbusorientedplatformwithoutstartup ;overheadbothforcomputationandcommunication, ;[3]presentsananalyticproofforabusnetworkthat ;foraminimaltimesolutionallprocessorsmuststop ;computingatthesametime.Withthisprinciple,opti

    ;malsolution,togetherwithclosedformexpressionsfor

    ;themakespanTCanbederived.BecauseTdoesnot ;dependonthedistributesequencing,sotheresultinis

;optima1.

    ;Inthecaseofof=s=0anddifferentg,whichre

    ;presentaheterogeneoussingle--leveltreenetworkwith-- ;outstart--upoverheadbothforcomputationandcommu.. ;nicationcosts,theclosedformcanbeobtainedbothfor

    ;andthemakespanT.ButTstronglydependsonthe ;distributionsequencing,andtheprimaryoptimalprin

    ;ciplethatallprocessorsfinishprocessatthesametime ;isnolongervalid[5,7].Beaumontetc.provedthat ;theoptimalorderinthiscaseshouldbefollowedby ;nondecreaseofg[5].

    ;Formostplatforms,thezerostartuptimebecomes

    ;quiteunrealistic[5,7].Inthecaseofs=s,o=o, ;andgg,theinfluenceofoverheadsontheoptimal ;processingtime,andtheeffectofchangingtheload ;distributionsequenceonthetimeperformanceareana

    ;lyzedin[7-9].Anecessaryandsufficientcondition ;fortheexistenceoftheoptimalprocessingtimeisalso ;given,thattheoptimalsequencingshouldbefollowed ;bynondecreaseofto.Beaumontetc.[5]alsoproved ;thatforthecaseofs=0,differentoandg=g,opti

    ;malsequencingshouldbefollowedbynondecreaseof

    ;g×.

    ;Inthecaseofarbitrarilyo,arbitrarilyg., ;arbitrarily

    ;s,andarbitrarily?,schedulingdivisibleworkloadon

    ;starandtreenetworktopologies,bothlinearandaffined ;costmodelsforcommunicationandcomputationare ;studiedin[5].Theresultshowsthatforenoughwork

    ;load,theoptimalsequenceshouldbefollowedbynon

    ;decreaseofg.

    ;Allstudiesaboveareconsideredonlythe”blocking

    ;modeofcommunication”,i.e.,theprocessorscan

    ;startcomputingprocessonlyafteralltheloadfractions ;havebeenreceivedcompletely.Forsomeapplications, ;“blockingmodeofcommunication”isn’tsuitable.In

    ;veryrecently,communicationin”non—blockingmode

    ;ofcommunication”,i.e.oncereceivedafractions0f

    ;itsshare,processorPcanstartcomputingprocess, ;withoutwaitingforalltheloadfractionshavebeenre

    ;

    ;?

    ;442?重庆邮电大学学报(自然科学版)20

    ;ceivedcompletely,hasbeenstudied[6].nonbloc

;kingmodewasfirstintroducedinhomogeneoustree

    ;networkandlinearnetwork,andin[6]someresults ;includesclosedformexpressionfortheprocessingtime ;andoptimalsequenceofschedulinginheterogeneous ;systemwerederived.In[6],areconsideredwithno ;startupoverheads.

    ;Ithasbeenwidelyacceptedthatstartupcostsplays

    ;somesignificantinfluenceonscheduling[5,7-9].In ;thispaper,weaddstartupcosts,bothforcomputation ;andc0mmunication.Themodelsweconsideredaredif- ;ferentfromtheearlyblockingmode1.Comparedwith ;somefewnonblockingmodel,whichonlyfocuseson ;eitherhomogeneoussystemorwithoutstartupcosts,

    ;ourresultsaremoregeneral,theresultsinprevious ;studiesareonlyspecialcase.

    ;3Optimalprinciples

    ;Ithasbeenprovedthatifthereisnostartupcost

     ;bothofcommunicationandcomputing,optimalalgo

    ;rithmsforDLTsatisfiedthefollowingproperties:?All

    ;processorsareinvolvedincomputing;?Allprocessor

    ;completeitstaskatthesametimeinstance:?The

    ;schedulingsequencedoesnotaffectthemakespan; ;Thereisnotimeintervalforthemasterprocessorbe

    ;tweenanytwocommunication;and?Processorstart

    ;computingatonceuponitsworkloadbeingreceived. ;Theaboveresultsareobvious,andtheycanbe ;Dr0vedviareductiontoabsurdity.Basedonthesere

    ;suhs,alinearprogrammingcanbeobtained.Because ;thatisareallinearprogramming,itcanbesolvedin ;polynomialtime.

    ;Itisnecessarytonotethat,withnonzerostartupo

    ;verheads.thereisnoneedtoinvolveallprocessorsin ;computationintheblockingmodeofcommunication ;[10,11].Inthenonblockingmodeofcommunica.

    ;tion,itisalsoeasytoprovethefollowingtheory. ;Theorem1Innon.blockingmodeofcommunica

    ;tion,optimalschedulingforDLTsatisfiedthefollowing ;conditions:

    ;1)Thereisnotimeintervalforthemasterprocessor ;betweenanytwocommunications;

    ;2)Theslavemastersstartcomputingatonceupon ;itsworkloadbeingreceived;

    ;3)Allinvolvedprocessorsstopcomputingatthe ;sametime.

    ;Thisprincipleseemsintuitive,whichcanbeproved ;byreductiontoabsurdity.Infact,ifitisnottrue,the ;processingtimecanbereducedbytransferringsome ;fractionsofloadfrombusyprocessorstoidleproces

    ;sors,whichincontradictiontotheoptimalprocessing ;time.Theprocessofproofistrivialsoweleaveitout. ;4Optimalschedulingforlargescale

    ;applications

    ;InSection4,byassumingthatworkloadsaredis

    ;patchedinafixedsequence.wederivedclosedform

    ;solutionofDLTinnonblockingmodeofcommunica

    ;ti0n.InthisSection.wewilldecidetheoptimalsched

    ;ulingsequences.

    ;Besidethe3conditionspresentedinTheorem1,0p

    ;timalalgorithmfordivisibleloadschedulingOllhetero

    ;geneouscomputingsysteminvolvesproblemsinclu

    ;ding:?Whichprocessorsshouldbeinvolvedinpro.

    Inwhatordersdoesthesubloadbeingdis ;cessing??

    ;tributed?Sequencingplaysanimportantroleindeter

    ;miningtheoptimalschedulingalgorithm.Ifthese

    ;quencingisdetermined,closedformexpressionboth

    ;f0r,theprocessingtimeand0,sharingofprocessor ;Pcanbeobtained.Thusoptimaldistributionproblem ;ofdivisibleloadcanbesolved.

    ;4.1Theproblemofchoosinginvolvedprocessor ;Let’sconsiderthefollowingexample.Theparame—

    ;tersofthesystemsarelistedinTab.1.Thecommuni

    ;cati0nisofnonblockingmode.Iftheamountofwork

    ;loadis20,whatisthemakespan?

    ;Tab.1Asimpleheterogeneoussystem

    ;Becausethereareonly5processorsinthissystem, ;wecancalculatethemakespanviaenumerate.The0p

    ;

    ;4WANGQing

    ~xian:Optimalalgorithmsforschedulinglarge-scaleappllcaliononheuerogeneoussyslems

    ?443?

    ;timalmakespanis11.988,andonly3processori.e., ;P1,P2andp4areneeded,anditsshareofworkloadis ;10.02,4.82,5.16,respectively.

    ;However,bycheckingdeducestepsinSection2, ;onemayfinditistheamountofworkloadthathave ;whichprocessorbeinvolvedincomputing.Forlarge

    ;scaleapplication,wecanwritethefollowingtheorem ;safety.

;Theorem2Tosolvethelargescaleapplicationona

    ;giVensystem,allprocessorsmustbeinvolvedin ;process?

    ;Forexample,intheaboveinstance,thereareonly3 ;processorneededbeinvolvedinprocessifworkloadis ;20.Butiftheamountofworkloadis100,andthe ;schedulingsequenceisfollowedbytheincreasingindex ;order,thenthemakspanis31.6487,andallproces-. ;sorsareinvolvedwithworkload33.81,18.82,6.35, ;26.93.14.09respectively.

    ;Hence,fromhereweassumetheamountofworkload ;islargeenoughsothatallprocessorallbeinvolved. ;Anditisamustforallhighperformancecomputing ;systems.

    ;4.2Optimalsequencesforloaddistributing ;Forclarity,weconsiderthesimplecaseofschedu

    ;lingdivisibleloadonheterogeneoussystemwithtwo ;processors.Thereareonlytwopossiblesequencesfor ;scheduling,asillustratedinFig.1.

    ;,t0

    ;p

    ;..』囊

    ;jI1III

    ;广一(_rr1

    ;I?I

    ;,”r”l+l

    ;II

    ;?_]

    ;Hg.1.Comparisonofthetwopossiblesequences ;Similartomethodin[2],wecomparetheentire ;loadprocessedinagivenperiod.Obviously,fora ;giventimeunits,thelargertheentireloadprocessed, ;thebetterthesequencesare.

    ;Obviously,onlywhentheentireloadmlsatisfies ;thefollowingconditiondoesprocessorP2beinvolvedin ;processing.

    ;Sf+fx.taI>o2+s2+gIx.taI

    ;Wla>坐二(3)lIltl>——()

    ;wi——gi

    ;Wehaveassumedthatcommunicationspeedsare ;fasterthancomputationspeeds,sow1gl>0.ml

    ;>0requireso2+S2-Sl>0.

    ;Iftheamountofworkloadislargeenough, ;thentwo

    ;processorsareallbeinvolvedinprocessing, ;wehave

    ;1=

    ;T(o+s1)

    ;Wecanobtaintheexpressionofa2as

    ;T(0l++o2+s2)T(o1

    ;a==(————————————

    ;w2w1×

    ;+s1)

    ;w,

    ;(4)

    ;×g

    ;(5)

    ;TheamountofworkloadcanbecomputedinTis ;l+2.

    ;Now,letusinterchangetheschedulingsequence ;(seeFig.1(b)).Letdenotesfractionassignedto ;processorPintheinterchangedsequence,0Ialbethe ;wholeworkloadprocessedinthissequences.Usingthe ;samemethodabovementioned,canbeobtained. ;Thedifferencesbetweenalandmlis

    ;Wt.tlVt0Ial=(l+2)(1+2)=

    ;(Tx(g2g1)+o2x(w2g2)+

    ;o1x(g1w

    ;Fromequation

    ;obtained:

    ;(6)

    ;)+s1xgls2xg2)

    ;,thefollowingsolutions

    ;(6)

    ;canbe

    ;?Ifstartupoverheadsbothf0reommunicationand ;computationbeingneglected,i.e.so=0, ;then

    ;.tl

    ;>.tlg1<g2

    ;whichmeanstheoptimalsequencesshouldbefo1. ;1owedbynonincreaseofg,

    ;i.e.,thefasterlink

    ;speedsshouldbeconsideredfirst.

    ;?Forbusnetworks,i.e.0=0,gg,wehave ;ml>01l0(to2to1)g(s2s1)>0

    ;toto

    ;?——

;>

    ;52.s1o

    ;Further,ifthestartupoverheadsofcomputationbe.

    ;ingneglected,i.e.s=0;orifthestartupcostsare

    ;same,i.e.,s=s,wehave

    ;ml

    ;>l1<w2

    ;whichmeansthattheoptima/sequencesandproces-. ;

    ;?

    ;444?重庆邮电大学学报(自然科学版)20

    ;sorsarrangementshouhtw1<w2,i.e.,thefasterpro

    ;cessorshouldbeconsideredfirst.

    ;?Iftheworkloadislargeenough,thisisthemain ;reasonforusingparallelanddistributedcomputingisto ;sharetheheavycomputationalworkamongtheproces

    ;sors.Theworkloadissolargethatcannotbeprocessed

    processorsareneeded. ;byasingleprocessor,butmulti

    ;Then,thefinishingtimeTislongerenough,which ;makesitemTx(g2g1)themostsignificantandoth- ;eritemscanbeneglected.Soforthiscase,theoptimal ;sequenceisgl<g2.

    ;Basedondifferentprocessorconfigurationplatforms, ;i.e.,differenthypothesesofo?g,andw,aseries ;ofresultscanbeobtainedfromEq(6).Wedonot ;discussithereindetail.

    ;Fordifferentschedulingsequence,communication ;timemaybedifference.InFig.1,thedifferencesin ;twosequencesoflinkoccupiedtimeis

    ;(o+g)=

    ;(.l+sI+.22g2.22.11g2)gig2

    ;wiw2

    ;Witharbitrarynetworkspeeds,arbitrarynetwork ;startupcosts,arbitrarycomputationstart-upcosts, ;andarbitraryprocessorspeeds,itisobviousthatthe ;differenceisnonzero,whichwilldelaythenextproces- ;sor’sstarttime.Firstly,weconsidersomeparticular

    ;cases.

    ;Theorem3Forhomogeneoussystems,i.e.,with ;thesamenetworkspeeds,thesamenetworkstartup

    ;costs,thesamecomputationstartupcosts,andthe

    ;sameprocessorspeeds,Optimalschedulingdoesnot ;dependonthedistributesequencing.

    ;ProofSubstitute,o,g,wwiths,o,g,win(6)

    ;andin(7)respectively,bothexpressionsareequalto ;zero,whichmeansthatifweinterchangeanypairP, ;P+1,1??n1,thetimeandtheworkloadsarenot ;change,thusprovetheresult.

    ;Theorem4:Forheterogeneoussystemswithzero

    upcostsbothforcommunicationandcomputation, ;start

    ;arbitrarycommunicationspeedandarbitrarycomputa

    ;tionspeed,inordertoachieveminimumprocessing ;time,thesequenceofloaddistributioninthenonbloc

    ;kingmodeofcommunicationshouldfollowtheorderin ;whichthelinkspeedsdecrease.

    ;ProofThistheoremcanbeeasilyprovedbysubsti

    ;tuteo=0,=0in(6)and(7)respectively.Byin

    ;terchangingP,P+1,1?i?n1,from(7)thelinkoc

    ;cupiedtimedoesnotbeinfluenced.Whilefrom(6) ;?

Report this document

For any questions or suggestions please email
cust-service@docsford.com