Earlier posts (ASAP ALAP scheduling @perl, DFG @ perl, DFG @ scheduling Algorithm) covered why the DFG (data flow graph) matters in high level synthesis.
This post walks through an example that implements a synthesis flow from syntax to hardware architecture exploration.
step1.
syntax parser 2 DFG
For now only directed graphs without cycles are supported.
ex: e=e+1 forms a cycle in the graph, so it cannot be handled yet.
# c = (a+b)>>1;
# d = w*(a-b);
# e = d-c-g*c;
my $tt = ['c','=','(','a','+','b',')','>>','1',';'];
my $cc = ['d','=','w','*','(','a','-','b',')',';'];
my $gg = ['e','=','d','-','c','-','g','*','c',';'];
#my $gg = ['e', '=', 'e', '+', '1', ';'];
my $syn = SysPerl::syntax2DFG->new();
$syn->read_text($tt);
$syn->run_text();
$syn->free();
$syn->read_text($cc);
$syn->run_text();
$syn->free();
$syn->read_text($gg);
$syn->run_text();
$syn->free();
This generates the sample graph.
result: [figure: generated sample DFG]
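To make the token-to-graph step concrete, here is a minimal sketch of how a token list like $tt could be turned into DFG edges with a precedence-climbing parser. It is not the SysPerl::syntax2DFG implementation: the function names (tokens_to_dfg, parse_expr, parse_atom), the precedence table, and the per-statement op counter are all assumptions made for illustration. Only the vertex naming op::id follows the post.

```perl
use strict;
use warnings;

# Assumed binary operator precedence (higher binds tighter), C-like.
my %PREC = ('>>' => 1, '<<' => 1, '+' => 2, '-' => 2,
            '*'  => 3, '/'  => 3, '%' => 3);

# Parse a variable, a constant, or a parenthesized sub-expression.
sub parse_atom {
    my ($st) = @_;
    my $t = shift @{ $st->{tok} };
    if ($t eq '(') {
        my $v = parse_expr($st, 0);
        shift @{ $st->{tok} };    # consume ')'
        return $v;
    }
    return $t;
}

# Precedence climbing: every operator becomes a vertex "op::id"
# with one incoming edge per operand.
sub parse_expr {
    my ($st, $min) = @_;
    my $lhs = parse_atom($st);
    my $tok = $st->{tok};
    while (@$tok && exists $PREC{ $tok->[0] } && $PREC{ $tok->[0] } >= $min) {
        my $op  = shift @$tok;
        my $rhs = parse_expr($st, $PREC{$op} + 1);    # left-associative
        my $v   = $op . '::' . $st->{count}{$op}++;
        push @{ $st->{edges} }, [$lhs, $v], [$rhs, $v];
        $lhs = $v;
    }
    return $lhs;
}

# Input: ['c','=','(','a','+','b',')','>>','1',';']
# Output: array ref of [from, to] edges.
sub tokens_to_dfg {
    my ($tokens) = @_;
    my $st = { tok => [@$tokens], edges => [], count => {} };
    my $dst = shift @{ $st->{tok} };    # assignment target
    shift @{ $st->{tok} };              # consume '='
    my $root = parse_expr($st, 0);
    push @{ $st->{edges} }, [$root, $dst];
    return $st->{edges};
}

my $edges = tokens_to_dfg(['c','=','(','a','+','b',')','>>','1',';']);
print "$_->[0] -> $_->[1]\n" for @$edges;
```

A statement like e=e+1 would produce an edge back into its own input, which is the cycle case the flow does not support yet.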
step2.
add time constraint
Attach time info (delay) to each kind of OP.
my $constrain_time_weighted_vertices = {
'+' => 1, # add delay 1 unit s
'-' => 1, # sub delay 1 unit s
'*' => 5, # mul
'/' => 8, # div
'%' => 8, # rem
'>>' => 1, # rsht
'<<' => 1, # lsht
};
If the time constraint is exceeded, w::@ and r::@ vertices are created. w::@ is a write to an internal register and r::@ is a read from that register, and the two are one clock cycle apart.
ex:
>>::0
means op(>>), id(0)
result: [figure: DFG with time constraints applied]
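The register insertion rule can be sketched as follows. This is only my reading of the rule, not the code in SysPerl::constrain2DFG: ops are chained inside one clock cycle until the accumulated delay would exceed an assumed budget $period, then the op moves to the next cycle and every edge crossing the boundary is marked as a place for a w::@ / r::@ pair. The function name insert_registers, the $period parameter, and the vertex ids in the example are hypothetical, and the sketch assumes every single op delay fits within $period.

```perl
use strict;
use warnings;

# Delay per operator, same values as the time table above.
my %DELAY = ('+' => 1, '-' => 1, '*' => 5, '/' => 8, '%' => 8,
             '>>' => 1, '<<' => 1);

# $order : op vertices in topological order
# $preds : op vertex => [predecessor op vertices]
# $period: delay budget of one clock cycle (assumed parameter)
# Returns (\%cycle, \@regs): cycle of each op, and the edges cut by a register.
sub insert_registers {
    my ($order, $preds, $period) = @_;
    my (%cycle, %finish, @regs);
    for my $v (@$order) {
        my ($op) = split /::/, $v;
        # Start right after the latest predecessor, in that predecessor's cycle.
        my ($c, $start) = (1, 0);
        for my $p (@{ $preds->{$v} || [] }) {
            if ($cycle{$p} > $c) {
                ($c, $start) = ($cycle{$p}, $finish{$p});
            }
            elsif ($cycle{$p} == $c && $finish{$p} > $start) {
                $start = $finish{$p};
            }
        }
        # Budget exceeded: push the op into the next cycle.
        if ($start + $DELAY{$op} > $period) {
            $c++;
            $start = 0;
        }
        $cycle{$v}  = $c;
        $finish{$v} = $start + $DELAY{$op};
        # Each edge crossing a cycle boundary needs a w::@ / r::@ pair.
        push @regs, [$_, $v] for grep { $cycle{$_} < $c } @{ $preds->{$v} || [] };
    }
    return (\%cycle, \@regs);
}

# Ops of the three sample statements; ids are my own numbering.
my @order = ('+::0', '>>::0', '-::0', '*::0', '-::1', '*::1', '-::2');
my %preds = (
    '>>::0' => ['+::0'],
    '*::0'  => ['-::0'],
    '-::1'  => ['*::0', '>>::0'],
    '*::1'  => ['>>::0'],
    '-::2'  => ['-::1', '*::1'],
);
my ($cycle, $regs) = insert_registers(\@order, \%preds, 6);
print "$_ -> cycle $cycle->{$_}\n" for @order;
print "w::@ / r::@ between $_->[0] and $_->[1]\n" for @$regs;
```

With a budget of 6 delay units the sub and mul of the second statement still chain in one cycle (1 + 5), while the ops of the third statement fall into the next cycle and read their inputs through registers.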
step3.
Cstep (cycle step) gen.
Build the info for each cycle step.
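One common way to get per-op cycle step info is to compute ASAP and ALAP steps and keep them as a begin/end frame. The sketch below does that under a simplifying assumption that every op takes exactly one cycle step, which is not how the flow above chains ops by delay. The function name time_frames and the $steps parameter are hypothetical; the begin/end keys only mirror the shape of the report shown later.

```perl
use strict;
use warnings;

# $order: op vertices in topological order
# $preds: op vertex => [predecessor op vertices]
# $steps: number of cycle steps available (assumed >= critical path length)
sub time_frames {
    my ($order, $preds, $steps) = @_;
    my (%asap, %alap);

    # ASAP: one step after the latest predecessor.
    for my $v (@$order) {
        my $s = 1;
        for my $p (@{ $preds->{$v} || [] }) {
            $s = $asap{$p} + 1 if $asap{$p} + 1 > $s;
        }
        $asap{$v} = $s;
    }

    # ALAP: one step before the earliest successor, walking backwards.
    $alap{$_} = $steps for @$order;
    for my $v (reverse @$order) {
        for my $p (@{ $preds->{$v} || [] }) {
            $alap{$p} = $alap{$v} - 1 if $alap{$v} - 1 < $alap{$p};
        }
    }

    my %frame;
    $frame{$_} = { begin => $asap{$_}, end => $alap{$_} } for @$order;
    return \%frame;
}

my $frames = time_frames(
    ['+::0', '>>::0', '-::0', '*::0'],
    { '>>::0' => ['+::0'], '*::0' => ['-::0'] },
    3,
);
print "$_: begin $frames->{$_}{begin}, end $frames->{$_}{end}\n"
    for sort keys %$frames;
```

An op whose begin equals its end has no freedom left; the others are what the scheduler gets to move.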
step4.
add power constraint
# set the average power consumed per unit (OP)
my $constrain_power_weighted_vertices = {
'+' => 4,
'-' => 4,
'*' => 8,
'/' => 10,
'%' => 10,
'>>' => 0,
'<<' => 0,
};
my $con = SysPerl::constrain2DFG->new();
$con->set_deep_DFG($DFG);
$con->set_constrain_time_weighted($constrain_time_weighted_vertices);
$con->set_constrain_power_weighted($constrain_power_weighted_vertices);
$con->run_constrain_time_weighted_DFG();
$con->run_constrain_NewDFG();
# $con->dump_ALUDFG_graphviz_file('alu.dot');
$con->dump_NewDFG_graphviz_file('con.dot');
step5.
run Force-Directed Scheduling && report
my $sch = SysPerl::schedule2DFG->new();
$sch->set_deep_cons2DFG($con);
$sch->run_forece_directed_scheduling();
$sch->report();
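For readers new to force-directed scheduling, the core quantities are the distribution graph and the self force. The sketch below shows only those two pieces under stated assumptions: each op is equally likely to land on any step of its begin/end frame, all ops share one distribution graph (no split by resource type), and predecessor/successor forces are ignored. The function names and the sample frames are hypothetical and are not taken from SysPerl::schedule2DFG.

```perl
use strict;
use warnings;

# Sum, per cycle step, the probability of every op being scheduled there.
sub distribution_graph {
    my ($frames, $steps) = @_;
    my @dg = (0) x ($steps + 1);    # index 0 is unused
    for my $f (values %$frames) {
        my $prob = 1 / ($f->{end} - $f->{begin} + 1);
        $dg[$_] += $prob for $f->{begin} .. $f->{end};
    }
    return \@dg;
}

# Self force of fixing one op (given by its frame) at cycle step $step.
# A lower force means the step is less crowded.
sub self_force {
    my ($dg, $frame, $step) = @_;
    my $prob  = 1 / ($frame->{end} - $frame->{begin} + 1);
    my $force = 0;
    for my $i ($frame->{begin} .. $frame->{end}) {
        my $delta = ($i == $step ? 1 : 0) - $prob;
        $force += $dg->[$i] * $delta;
    }
    return $force;
}

# Two ops fixed in step 1, one op free to go to step 1 or 2.
my %frames = (
    'A' => { begin => 1, end => 1 },
    'B' => { begin => 1, end => 1 },
    'C' => { begin => 1, end => 2 },
);
my $dg = distribution_graph(\%frames, 2);
for my $step (1 .. 2) {
    printf "C at step %d: force %.2f\n", $step, self_force($dg, $frames{C}, $step);
}
```

The scheduler would fix C at the step with the lowest force, then rebuild the frames and the distribution graph and repeat for the remaining ops.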
results from $sch->report()
$VAR1 = {
'ALU' => {
'-::1' => {
'begin' => 2,
'end' => 2
},
'+::0' => {
'begin' => 1,
'end' => 1
},
'>>::0' => {
'begin' => 1,
'end' => 1
},
power for each cycle step
$VAR1 = {
'1' => 17,
'2' => 16
};
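The per-cycle power numbers can be thought of as a sum of op power weights per cycle step. Here is a minimal sketch of that sum, assuming each op occupies exactly one cycle step and that register reads/writes cost nothing, so it is not expected to reproduce the exact numbers above. The function name power_per_cstep and the sample schedule are hypothetical.

```perl
use strict;
use warnings;

# $schedule: op vertex => cycle step, e.g. '+::0' => 1
# $power   : operator  => average power, as in the power table of step4
sub power_per_cstep {
    my ($schedule, $power) = @_;
    my %total;
    for my $v (keys %$schedule) {
        my ($op) = split /::/, $v;    # '+::0' -> '+'
        $total{ $schedule->{$v} } += $power->{$op} // 0;
    }
    return \%total;
}

my %power    = ('+' => 4, '-' => 4, '*' => 8, '>>' => 0);
my %schedule = ('+::0' => 1, '>>::0' => 1, '*::0' => 1, '-::1' => 2);
my $total    = power_per_cstep(\%schedule, \%power);
print "cycle $_: $total->{$_}\n" for sort { $a <=> $b } keys %$total;
```

Comparing these totals across candidate schedules is one way to see whether the peak cycle got flatter.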
future works...
1. cluster register
reduce the number of feedback registers used to store tmp values
2. cluster ops into ALU blocks, such as
//hardware ALU block
void iALU_block_1(int a, int b, int *c){
*c = a + b; // write the sum through the output pointer
}
...
//@ cycle domain
// cycle 1
iALU_block_1(a,b,&c);
iALU_block_2(a,b,&c);
//cycle 2...
3. architecture exploration from DFG...
c/verilog ...
project:
https://github.com/funningboy/SOC_c_model/blob/master/Algorithm/Force_Directed_Scheduling/main.pl
refs:
Force Directed Scheduling for Behavioral Synthesis
Modified Force-Directed Scheduling for Peak and Average Power
Parallel Algorithms for Force Directed Scheduling of Flattened and
Scheduling
ebook
http://ishare.iask.sina.com.cn/f/10389743.html
@ Operator-precedence parser
http://en.wikipedia.org/wiki/Operator-precedence_parser