learning plus: Force-Directed Scheduling with high-level synthesis @ perl

Earlier posts (ASAP ALAP scheduling @perl, DFG @ perl, DFG @ scheduling Algorithm..) already covered why the DFG (data flow graph) matters in high-level synthesis. Below is an example that implements a synthesis flow from syntax all the way to hardware architecture exploration.

step1. syntax parser 2 DFG. For now only directed graphs without cycles are supported. ex: e=e+1 is a cycle graph and is not handled.

#  c = (a+b)>>1;
#  d = w*(a-b);
#  e = d-c-g*c;

my $tt = ['c','=','(','a','+','b',')','>>','1',';'];
my $cc = ['d','=','w','*','(','a','-','b',')',';'];
my $gg = ['e','=','d','-','c','-','g','*','c',';'];
#my $gg = ['e', '=', 'e', '+', '1', ';'];

my $syn = SysPerl::syntax2DFG->new();

$syn->read_text($tt);
$syn->run_text();
$syn->free();

$syn->read_text($cc);
$syn->run_text();
$syn->free();

$syn->read_text($gg);
$syn->run_text();
$syn->free();
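To make the token-list to DFG idea concrete, here is a minimal sketch that does not use SysPerl at all. It assumes a shunting-yard style pass with C-like operator precedence, names each operator vertex `op::id` as in the rest of this post, and rejects a statement whose target appears on its own right-hand side (the e=e+1 case). The function name `tokens_to_edges` and the edge-list output are made up for illustration; the real SysPerl::syntax2DFG internals may work differently.

```perl
use strict;
use warnings;

# Operator precedence (higher binds tighter); all operators are left-associative.
my %prec = ('*' => 3, '/' => 3, '%' => 3, '+' => 2, '-' => 2, '<<' => 1, '>>' => 1);

# Turn one statement (target, '=', expression tokens, ';') into DFG edges.
# Returns a list of [from, to] pairs. $count keeps a per-operator id counter
# that is shared across statements, so vertex names stay unique.
sub tokens_to_edges {
    my ($tokens, $count) = @_;
    my @tok    = @$tokens;
    my $target = shift @tok;
    shift @tok;                                   # drop '='
    pop @tok if @tok && $tok[-1] eq ';';
    die "cycle: $target is used in its own definition\n"
        if grep { $_ eq $target } @tok;

    my (@out, @ops, @edges);
    my $apply = sub {                             # pop one operator, emit its vertex
        my $op  = pop @ops;
        my $rhs = pop @out;
        my $lhs = pop @out;
        my $id  = $count->{$op}++;
        my $v   = "${op}::$id";
        push @edges, [$lhs, $v], [$rhs, $v];
        push @out, $v;
    };

    for my $t (@tok) {
        if ($t eq '(') {
            push @ops, $t;
        } elsif ($t eq ')') {
            $apply->() while @ops && $ops[-1] ne '(';
            pop @ops;                             # discard the matching '('
        } elsif (exists $prec{$t}) {
            $apply->() while @ops && $ops[-1] ne '(' && $prec{$ops[-1]} >= $prec{$t};
            push @ops, $t;
        } else {
            push @out, $t;                        # operand (variable or constant)
        }
    }
    $apply->() while @ops;
    push @edges, [$out[0], $target];              # final value feeds the target
    return @edges;
}

my @stmts = (
    ['c','=','(','a','+','b',')','>>','1',';'],
    ['d','=','w','*','(','a','-','b',')',';'],
    ['e','=','d','-','c','-','g','*','c',';'],
);
my (%count, @edges);
push @edges, tokens_to_edges($_, \%count) for @stmts;
print join(' -> ', @$_), "\n" for @edges;
```

For the first statement this prints the edges a -> +::0, b -> +::0, +::0 -> >>::0, 1 -> >>::0 and >>::0 -> c. Unary operators and error recovery are left out on purpose.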

The generated sample graph result:

step2. add time constraint. Attach time info to each kind of OP.

my $constrain_time_weighted_vertices = {
    '+'  => 1,   # add, delay of 1 time unit
    '-'  => 1,   # sub, delay of 1 time unit
    '*'  => 5,   # mul
    '/'  => 8,   # div
    '%'  => 8,   # rem
    '>>' => 1,   # rsht
    '<<' => 1,   # lsht
};

If the time constraint is exceeded, w::@ and r::@ vertices are created. w::@ is a write to an internal register and r::@ is a read from that internal register, and the two are one clock cycle apart. ex: >>::0 denotes the result of op(>>), id(0).
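One way to picture the register split is as operator chaining under a per-cycle time budget: operators are chained inside one cycle while their accumulated delay fits, and a register boundary (the w::@ / r::@ pair) pushes the consumer into the next cycle when it does not. The sketch below is only an illustration under assumptions. The budget of 6 time units, the function name `chain_schedule` and the hand-written predecessor table are all hypothetical, and operators whose delay alone exceeds the budget are simply rejected instead of being spread over several cycles.

```perl
use strict;
use warnings;

my %delay = ('+' => 1, '-' => 1, '*' => 5, '/' => 8, '%' => 8, '>>' => 1, '<<' => 1);

# Operator-to-operator dependencies of the three sample statements
# (primary inputs such as a, b, w, g are omitted).
my %preds = (
    '+::0'  => [],
    '>>::0' => ['+::0'],
    '-::0'  => [],
    '*::0'  => ['-::0'],
    '-::1'  => ['*::0', '>>::0'],
    '*::1'  => ['>>::0'],
    '-::2'  => ['-::1', '*::1'],
);

# Returns vertex => { cycle => n, finish => t }, where finish is the time
# inside the cycle at which the result is ready.
sub chain_schedule {
    my ($preds, $delay, $budget) = @_;
    my %at;
    my $visit;
    $visit = sub {
        my ($v) = @_;
        return $at{$v} if $at{$v};
        my ($op) = split /::/, $v;
        my $d = $delay->{$op};
        die "delay of $v exceeds the cycle budget\n" if $d > $budget;
        my ($cycle, $start) = (1, 0);
        for my $p (@{ $preds->{$v} }) {
            my $pa = $visit->($p);
            # Chain in the predecessor's cycle if the budget allows it,
            # otherwise cross a register boundary into the next cycle.
            my ($c, $s) = $pa->{finish} + $d <= $budget
                ? ($pa->{cycle}, $pa->{finish})
                : ($pa->{cycle} + 1, 0);
            ($cycle, $start) = ($c, $s)
                if $c > $cycle || ($c == $cycle && $s > $start);
        }
        return $at{$v} = { cycle => $cycle, finish => $start + $d };
    };
    $visit->($_) for keys %$preds;
    return \%at;
}

my $at = chain_schedule(\%preds, \%delay, 6);
printf "%-6s cycle %d, ready at t=%d\n", $_, $at->{$_}{cycle}, $at->{$_}{finish}
    for sort keys %$at;
```

With this assumed budget, +::0 and >>::0 share cycle 1, while -::1 has to wait for *::0 and lands in cycle 2. Whether SysPerl uses the same rule is not something this sketch can confirm.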

step3. Cstep (cycle step) gen. Build the info for each cycle step.

step4. add power constraint

# set the average power consumed by each kind of OP
my $constrain_power_weighted_vertices = {
    '+'  => 4,
    '-'  => 4,
    '*'  => 8,
    '/'  => 10,
    '%'  => 10,
    '>>' => 0,
    '<<' => 0,
};

my $con = SysPerl::constrain2DFG->new();
$con->set_deep_DFG($DFG);

$con->set_constrain_time_weighted($constrain_time_weighted_vertices);
$con->set_constrain_power_weighted($constrain_power_weighted_vertices);

$con->run_constrain_time_weighted_DFG();
$con->run_constrain_NewDFG();

# $con->dump_ALUDFG_graphviz_file('alu.dot');
$con->dump_NewDFG_graphviz_file('con.dot');

step5. run Force-Directed Scheduling && report

my $sch = SysPerl::schedule2DFG->new();
$sch->set_deep_cons2DFG($con);

$sch->run_forece_directed_scheduling();
$sch->report();
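For readers who have not met force-directed scheduling before, the core quantities can be sketched in a few lines. Each operation has a time frame [earliest, latest] cycle step, it is assumed equally likely to land anywhere in that frame, the distribution graph sums those probabilities per step, and the self force of pinning an operation to one step is the sum over its frame of DG(i) times the change in probability. The frames and helper names below are hypothetical, only one operator type is used, and predecessor/successor forces are left out, so this is a sketch of the idea and not the SysPerl implementation.

```perl
use strict;
use warnings;

# Hypothetical time frames [earliest, latest], e.g. taken from ASAP/ALAP.
my %frame = ('+::0' => [1, 1], '+::1' => [1, 2]);

# Distribution graph: expected number of operations active in each step.
sub distribution {
    my ($frame) = @_;
    my %dg;
    for my $v (keys %$frame) {
        my ($lo, $hi) = @{ $frame->{$v} };
        my $p = 1 / ($hi - $lo + 1);              # uniform over the frame
        $dg{$_} += $p for $lo .. $hi;
    }
    return \%dg;
}

# Self force of fixing vertex $v at step $t:
# sum over the frame of DG(i) * (new probability - old probability).
sub self_force {
    my ($dg, $frame, $v, $t) = @_;
    my ($lo, $hi) = @{ $frame->{$v} };
    my $p = 1 / ($hi - $lo + 1);
    my $f = 0;
    $f += $dg->{$_} * (($_ == $t ? 1 : 0) - $p) for $lo .. $hi;
    return $f;
}

my $dg = distribution(\%frame);
printf "+::1 at step %d: force %.2f\n", $_, self_force($dg, \%frame, '+::1', $_)
    for 1, 2;
# prints:
# +::1 at step 1: force 0.50
# +::1 at step 2: force -0.50
```

The negative force at step 2 means that placing +::1 there evens out the load, which is the choice the scheduler would prefer in this tiny case.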

results

$VAR1 = {
         'ALU' => {
                    '-::1' => {
                                'begin' => 2,
                                'end' => 2
                              },
                    '+::0' => {
                                'begin' => 1,
                                'end' => 1
                              },
                    '>>::0' => {
                                 'begin' => 1,
                                 'end' => 1
                               },

power for each cycle step

$VAR1 = {
         '1' => 17,
         '2' => 16
       };
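The per-cycle power figure can be thought of as the sum of the power weights of all operations active in that cycle step. The sketch below assumes exactly that, and also assumes an operation draws its power in every step from begin to end. It uses only the three ALU entries visible in the report above, so its totals are not expected to match the full numbers; `power_per_cstep` is a made-up helper name.

```perl
use strict;
use warnings;

my %power = ('+' => 4, '-' => 4, '*' => 8, '/' => 10, '%' => 10, '>>' => 0, '<<' => 0);

# Only the entries shown in the report excerpt; the full schedule has more.
my %alu = (
    '-::1'  => { begin => 2, end => 2 },
    '+::0'  => { begin => 1, end => 1 },
    '>>::0' => { begin => 1, end => 1 },
);

# Sum the power weight of every operation over the steps it occupies.
sub power_per_cstep {
    my ($alu, $power) = @_;
    my %sum;
    for my $v (keys %$alu) {
        my ($op) = split /::/, $v;                # '>>::0' -> '>>'
        $sum{$_} += $power->{$op} for $alu->{$v}{begin} .. $alu->{$v}{end};
    }
    return \%sum;
}

my $sum = power_per_cstep(\%alu, \%power);
print "cstep $_: $sum->{$_}\n" for sort { $a <=> $b } keys %$sum;
```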

future works...

1. cluster registers: reduce the number of feedback registers used to store tmp values
2. cluster ops into ALU blocks, such as

//hardware ALU block
void iALU_block_1(int a, int b, int *c){
    *c = a + b;   // write the result through the output pointer
}
...


//@ cycle domain
// cycle 1
iALU_block_1(a,b,&c);
iALU_block_2(a,b,&c);

//cycle 2...

3. architecture explore from DFG... c/verilog ...

project: https://github.com/funningboy/SOC_c_model/blob/master/Algorithm/Force_Directed_Scheduling/main.pl

refs:
Force Directed Scheduling for Behavioral Synthesis
Modified Force-Directed Scheduling for Peak and Average Power
Parallel Algorithms for Force Directed Scheduling of Flattened and
[PDF] Scheduling ebook http://ishare.iask.sina.com.cn/f/10389743.html