Lengthening Traces to Improve Opportunities for Dynamic Optimization...

By Yolanda Peters,2014-05-27 14:58

     Lengthening Traces to Improve Opportunities for Dynamic Optimization

     Chuck Zhao, Cristiana Amza, Greg Steffan (University of Toronto)
     Youfeng Wu (Intel Research)

     Feb. 16, 2007 Interact-12, HPCA

     Intel’s StarDBT Project

     StarDBT
         A Dynamic Binary Translation framework
         Operates on traces, optimizes hot traces

     Long term goal: Use StarDBT to allow legacy apps to exploit transactional memory (TM) support
         (NOT by automatically parallelizing legacy apps)
         Allow speculative sequential optimizations
         Use hardware TM’s checkpoint/restore

     Problem: default traces are too small
         TM overheads would overwhelm benefits

     Challenge: lengthening traces can be tricky

     Trace Formation

     [Figure: control-flow graph of blocks A through G, shown under a basic-block profile and under a trace profile, with on-trace blocks and off-trace stubs marked.]

     Control flow that goes off-trace can be costly

     Trade-offs when Lengthening Traces

     [Figure: the trace A B D F G drawn with different numbers of side exits, each with a 5% side-exit ratio; the completion ratios shown are 100% - 10% = 90% and 100% - 25% = 75%.]

     Completion ratio: likelihood of execution staying on trace
         percentage of execution reaching trace tail

     Tradeoffs:
         longer traces have more optimization opportunities
         longer traces have more side-exit branches

     Sweet spot exists in between; can we find it?
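The arithmetic on this slide can be reproduced with a small sketch. It assumes each side-exit ratio is measured relative to entries at the trace head, so the ratios simply add up; the function name `completion_ratio` is hypothetical and not part of StarDBT.

```python
def completion_ratio(side_exit_ratios):
    """Fraction of trace entries that reach the trace tail.

    Assumption: every ratio is relative to entries at the trace head,
    so the completion ratio is 1 minus the sum of the side-exit ratios.
    """
    total_exit = sum(side_exit_ratios)
    if not 0.0 <= total_exit <= 1.0:
        raise ValueError("side-exit ratios must sum to a value in [0, 1]")
    return 1.0 - total_exit


# More side exits at 5% each means a lower completion ratio.
for exits in (0, 2, 5):
    ratio = completion_ratio([0.05] * exits)
    print(f"{exits} side exits -> completion ratio {ratio:.2f}")
```

Under this additive model, every extra side exit on a longer trace directly lowers the completion ratio, which is the trade-off the slide describes.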

     Our Work So Far (i.e., this talk)

     1. Lengthening traces while maintaining completion ratios
         Through unrolling and straightening
         A characterization of the impact on traces
             length, completion ratio, unroll factor, …

     2. Improving optimization opportunities on longer traces
         Improve Local Value Numbering (LVN) hits
         Measurement of impact on performance is pending

     3. Performing on-the-fly actions by DBT system
         Decisions made by instrumenting/sampling code online

     Related Work

     Binary Translation Systems
         Dynamo, DynamoRIO, PIN
         StarDBT
             transparent translation
             x86 legacy code

     Trace Collection and Optimizations
         Java JIT
         Dynamo, DynamoRIO, Mojo
         StarDBT
             x86 binary level
             MRET2 to improve trace formation
             aggressive trace optimizations

     First full analysis of trace-lengthening issues for DBT systems

     StarDBT Trace Types

     [Figure: traces a, b, c, d and the dispatcher, illustrating three trace types: self type, other trace type, elsewhere type.]

     Lengthening Traces Through Unrolling

     [Figure: trace a, with a completion ratio of 90%, unrolled into repeated copies; the completion ratios shown are 90%, 81%, and 72.9%.]

     Unrolling increases trace’s length, but reduces completion ratio

     Finding the Sweet-Spot Unroll Factor

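     Unroll factor    Completion ratio
     1                p     (0.99)
     2                p^2   (0.98)
     3                p^3   (0.97)
     …                …
     N (10)           p^10  (0.904)
     N (11)           p^11  (0.895)

     given p_orig = 99% and p_target = 90% (chosen by system designer)

     With these values, N = 10 is the largest unroll factor whose completion ratio (0.904) stays at or above p_target; N = 11 (0.895) falls below it.

     Traces with 100% completion ratio: set N = 10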
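A minimal sketch of this selection rule, assuming the completion ratio of a trace unrolled N times is p_orig^N as in the table. The function name and the `full_completion_factor` parameter are hypothetical; only the fixed factor of 10 for traces with a 100% completion ratio comes from the slide.

```python
def sweet_spot_unroll_factor(p_orig, p_target, full_completion_factor=10):
    """Largest N such that p_orig**N >= p_target.

    Returns 1 (not unrolled) when even two copies would miss the target.
    """
    if not (0.0 < p_target <= 1.0 and 0.0 <= p_orig <= 1.0):
        raise ValueError("ratios must lie in [0, 1], with p_target > 0")
    if p_orig >= 1.0:
        # p_orig**N never drops below the target, so use the fixed factor.
        return full_completion_factor
    n = 1
    while p_orig ** (n + 1) >= p_target:
        n += 1
    return n


# The slide's example: p_orig = 99%, p_target = 90%.
print(sweet_spot_unroll_factor(0.99, 0.90))
```

The loop terminates because p_orig < 1 makes p_orig^N shrink toward zero, so it eventually falls below any positive target.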

     Lengthening Traces Through Straightening

     [Figure: blocks b, c, and d, before and after straightening.]

     We don’t yet implement/evaluate straightening

     Evaluation

     Distribution of Original Completion Ratios

     [Figure: stacked bars (axis 0% to 100%) showing the distribution of original completion ratios, in buckets from 0-9% to 90-100%, for bzip2, gzip, crafty, parser, vpr, mcf, and the average.]

     Majority of hot traces have completion ratios in 90%-100%

     Impact of Unrolling on Hot Trace Size

     [Figure: average number of instructions per hot trace (axis 0 to 60) for bzip2, gzip, mcf, parser, vpr, crafty, and the average, with bars for the original traces and for completion ratios of 98%, 94%, and 90%; annotated "36% longer".]

     Selected SPECint CPU2000 benchmarks with MinneSPEC inputs

     Lengthening increases hot trace size by more than 36%

     How Much are Traces Unrolled?

     [Figure: average unroll factor (axis 1 to 2.4, with a "Not unrolled" label) for bzip2, gzip, mcf, parser, vpr, crafty, and the average, with bars for target completion ratios of 98%, 94%, and 90%; annotated "1.38-1.58x".]

     Hot traces are unrolled on average by 1.38x or more

     Average Completion Ratio After Lengthening

     [Figure: average completion ratio (axis 0% to 90%) for bzip2, gzip, mcf, parser, vpr, crafty, and the average, with bars for the original traces and for target completion ratios of 98%, 94%, and 90%; annotated "<0.5%".]

     Lengthening traces reduces completion ratio by less than 0.5%

     Impact of Lengthening on Optimizations

     Local Value Numbering (LVN)

     No need to build Control Flow Graph (CFG)
         Partial info

     No need to perform Data Flow Analysis (DFA)
         Expensive, relies on CFG

     Can be arranged into a single-pass scan
         Ease of implementation
         Relatively lightweight algorithm

     Performs three optimizations:
         Common Subexpression Elimination (CSE)
         Copy Propagation (CP)
         Dead-Code Elimination (DCE)

     LVN is common in JIT optimizers

     Ex: LVN On a Lengthened Trace

     Original Traces    Lengthened Trace                          Optimized Trace
     …                  …                                         …
     c = a + b          c3 = a1 + b2                              c = a + b
     d = a              d1 = a1                    (DCE hit)
     e = b              e2 = b2                                   e = b
     f = d + e          f3 = d1 + e2  ->  f3 = c3  (CSE hit)      f = c
     d = x              d4 = x4                                   d = x
     …                  …                                         …
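The example above can be reproduced with a minimal LVN sketch over a straight-line trace. This is an illustration under stated assumptions, not StarDBT's implementation: the instruction encoding `(dest, op, args)`, the function name `lvn`, and the `live_out` set of variables needed after the trace are all hypothetical, and side exits are ignored.

```python
from itertools import count


def lvn(trace, live_out):
    """Optimize a straight-line trace; return (optimized_trace, cse_hits).

    Each instruction is (dest, op, args); op "copy" takes one source.
    """
    var_vn = {}    # variable -> value number it currently holds
    expr_vn = {}   # (op, operand value numbers) -> value number
    fresh = count(1)
    cse_hits = 0

    def vn_of(var):
        if var not in var_vn:
            var_vn[var] = next(fresh)
        return var_vn[var]

    def holder(vn):
        # A variable that currently holds this value number, or None.
        return next((v for v, n in var_vn.items() if n == vn), None)

    forward = []
    for dest, op, args in trace:
        arg_vns = tuple(vn_of(a) for a in args)
        # Copy propagation: name each operand by a current holder of its value.
        canon = tuple(holder(n) for n in arg_vns)
        if op == "copy":
            vn, new = arg_vns[0], (dest, "copy", canon)
        else:
            key = (op, arg_vns)  # operand order matters; no commutativity
            vn = expr_vn.get(key)
            src = holder(vn) if vn is not None else None
            if src is not None:
                cse_hits += 1  # value already computed and still available
                new = (dest, "copy", (src,))
            else:
                if vn is None:
                    vn = expr_vn[key] = next(fresh)
                new = (dest, op, canon)
        var_vn[dest] = vn
        forward.append(new)

    # Dead-code elimination: backward scan with a live-variable set.
    live = set(live_out)
    kept = []
    for dest, op, args in reversed(forward):
        if dest not in live:
            continue  # result is never read before being overwritten
        live.discard(dest)
        live.update(args)
        kept.append((dest, op, args))
    kept.reverse()
    return kept, cse_hits


trace = [
    ("c", "+", ("a", "b")),
    ("d", "copy", ("a",)),
    ("e", "copy", ("b",)),
    ("f", "+", ("d", "e")),
    ("d", "copy", ("x",)),
]
optimized, hits = lvn(trace, live_out={"c", "d", "e", "f"})
for dest, op, args in optimized:
    print(dest, "=", args[0] if op == "copy" else f" {op} ".join(args))
```

The forward pass assigns the same value numbers as the slide (a1, b2, c3, x4), so `f = d + e` becomes `f = c`, and the backward pass drops `d = a` because `d` is overwritten before it is read again.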

     LVN Hits Improvement (%)

     [Figure: percentage increase in LVN hits (axis 0% to 35%) for bzip2, gzip, parser, vpr, crafty, mcf, and the average, with bars for target completion ratios of 98%, 96%, 94%, and 90%.]

     10+% more LVN hits are available through lengthening

     Ongoing Work

     Complete DBT Optimization Framework
     Evaluate speculative optimizations on long hot traces with high completion ratios
     Automatically determine optimal transaction granularity
     Use HTM to support trace-based speculative optimizations

     Control Speculation

     A Compiler Framework for Speculative Analysis and Optimizations: Lin et al., PLDI '03

     Original (flow graph): cmp, with branch edges labelled 90+% and 10-%, leading to
         ld x=[y]
         …

     With control speculation:
         ld.s x = [y]
         if(c){
             chk.s x, recovery
         next:
             …
         }

         recovery:
             ld x=[y]
             jmp next

     Use HTM to Support Trace-based Speculative Optimizations

     Original (flow graph): cmp, with branch edges labelled 90+% and 10-%, leading to
         ld x=[y]
         …

     With HTM:
         start_tx
         ld x = [y]
         if(c){
             chk x, abort_tx
             …
         }
         commit_tx

     Use longer traces with high completion ratio as tx granularity

     HTM hardware support simplifies speculative optimization

     Conclusion

     Traces can be effectively lengthened
         increase in trace size by 36+%
         decrease in completion ratio by less than 0.5%

     Longer traces provide better opportunities for optimization
         increase in LVN hits by 10+%

     Q+A

     Complete StarDBT Optimization Framework

     x86 CISC ISA
         code patching won’t work
         Really need a code generator and IR

     Design + implement a low-level Runtime IR
         close to hardware
         capture + represent all necessary low-level info
         easy to convert from/to machine code
         easy to implement analysis and optimizations

     Starting point
         Dynamo IR
         LLVM IR
         GCC RTL
         …

     StarDBT Overall Structure

     Trace Formation Heuristics

     MRET: Most Recently Executed Tail
         originally proposed by Dynamo
         Trace head
             loop head (backward branch target)
             sampling counter reaches a certain threshold
         Trace tail
             satisfy certain trace-tail conditions

     MRET2: 2-pass MRET
         perform 2 independent MRET trace formations
         intersect traces with common head
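The slide does not say how the two traces are intersected. One plausible reading, used here purely as an assumption, is to keep the longest common prefix of two block sequences that start at the same head. The function name `intersect_traces` and the block labels are hypothetical.

```python
def intersect_traces(first, second):
    """Longest common prefix of two traces that share a head.

    Assumption: "intersect" is read as keeping the path both MRET passes
    agree on, starting from the common trace head.
    """
    if not first or not second or first[0] != second[0]:
        return []  # no common head, nothing to intersect
    common = []
    for block_a, block_b in zip(first, second):
        if block_a != block_b:
            break
        common.append(block_a)
    return common


# Two passes that diverge after block D keep only A, B, D.
print(intersect_traces(list("ABDFG"), list("ABDEG")))
```

Under this reading the second pass acts as a filter: blocks that only one pass saw do not make it into the final trace.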

     Traces and Hot Traces

     Trace
         MRET2 recognizes trace heads
         Trace tails satisfy certain conditions
         Blocks in between become a trace

     Hot Trace
         Based on recognized traces
         Put in additional software counters
             head: head counter
             each early-exit branch: off-trace counters
             sampling: hot trace’s completion ratio
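The counters above are enough to estimate a hot trace's completion ratio. The sketch below assumes the ratio is the fraction of head entries that did not leave through any early-exit branch; the class and method names are hypothetical and do not reflect StarDBT's actual instrumentation.

```python
from dataclasses import dataclass, field


@dataclass
class HotTraceCounters:
    """Software counters attached to one hot trace (illustrative only)."""
    head_count: int = 0
    off_trace_counts: dict = field(default_factory=dict)

    def enter(self):
        # Bumped every time execution enters the trace head.
        self.head_count += 1

    def side_exit(self, branch_id):
        # Bumped when execution leaves through the given early-exit branch.
        self.off_trace_counts[branch_id] = self.off_trace_counts.get(branch_id, 0) + 1

    def completion_ratio(self):
        # Fraction of entries that reached the trace tail, or None if unsampled.
        if self.head_count == 0:
            return None
        exits = sum(self.off_trace_counts.values())
        return (self.head_count - exits) / self.head_count


counters = HotTraceCounters()
for _ in range(100):
    counters.enter()
for _ in range(6):
    counters.side_exit("B")
for _ in range(4):
    counters.side_exit("D")
print(counters.completion_ratio())
```

Keeping one counter per early-exit branch, rather than a single total, also shows which branch is responsible for most off-trace executions.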
