2010年12月27日 星期一

ANSI C grammar

http://search.cpan.org/~dconway/Parse-RecDescent-1.965001/lib/Parse/RecDescent.pm http://www.adp-gmbh.ch/perl/rec_descent.html Eliminating Left Recursion in Parse::RecDescent http://www.perlmonks.org/?node_id=153155 http://prlmnks.org/html/553889.html
postfix_expression:
          primary_expression
        | (primary_expression)(s) '[' expression ']'
        | (primary_expression)(s) '(' ')'
        | (primary_expression)(s) '(' argument_expression_list ')'
c grammar @ perl http://cpansearch.perl.org/src/DCONWAY/Parse-RecDescent-1.93/demo/demo_Cgrammar.pl http://search.cpan.org/~dconway/Parse-RecDescent/MANIFEST c grammar @ lex yacc http://www.lysator.liu.se/c/ANSI-C-grammar-l.html http://www.lysator.liu.se/c/ANSI-C-grammar-y.html Free Programming Language Grammars for Compiler Construction http://www.thefreecountry.com/sourcecode/grammars.shtml PS: try 過 demo_Cgrammar.pl 後. 除了發現 grammar 有些地方要改之外其效能也不太理想. 大體上來說是拿 ANSI C 的 lex yacc 來改的, 但是碰到grammar check 時會有 deep recursive 的問題生. 看來還是得回到最原始的方式先用 lex yacc 轉成 dot/XML format 在餵到 tool 內

2010年12月22日 星期三

LLVM case study

compiler design http://gnuu.org/2009/09/18/writing-your-own-toy-compiler/ To visualize the control flow graph, you can use a nifty feature of the LLVM 'opt' tool. If you put this LLVM IR into "t.ll" and run "llvm-as <>viewCFG()" or "F->viewCFGOnly()" (where F is a "Function*") either by inserting actual calls into the code and recompiling or by calling these in the debugger. LLVM has many nice features for visualizing various graphs. Viewing graphs while debugging code LLVM IR is that it requires all basic blocks to be "terminated" with a control flow instruction such as return or branch. This means that all control flow, including fall throughs must be made explicit in the LLVM IR. If you violate this rule, the verifier will emit an error.
% llvm-gcc -O3 -emit-llvm hello.c -c -o hello.bc

The -emit-llvm option can be used with the -S or -c options to emit an LLVM ".ll" or ".bc" file (respectively) for the code. This allows you to use the standard LLVM tools on the bitcode file.

http://llvm.org/docs/Passes.html#basiccg llvm project gen http://llvm.org/docs/Projects.html Refs: 搭建LLVM實驗環境[轉貼] 演講:窮得只剩下 Compiler -- 淺談編譯技術的革命 「身騎 LLVM,過三關:淺談編譯器技術的嶄新應用」簡報上線 拖稿很久的 LLVM 使用文 Hello,LLVM

2010年12月21日 星期二

sensitive list condition gen @ SystemC

最近 SystemC 跟 SystemVerilog 吵的沸沸揚揚.估且不論效能跟支援度.其實說穿了就是要把 Hardware level design flow. 往上轉成 software level design flow to map hardware level. 這樣不僅縮短 architecture map 的時間. 透過 top down 的 design view 來建立 SOC 的 platform.畢竟現在 IC 都要包山包海的 multi cores. 傳統的 ASIC design flow 要做到 SOC platform 所花的 cost 實在太大了.而 ESL 有助於改善 ASIC design flow 所欠缺的部份.用 model / algorithm / scheduling ...做到早期的系統驗證. learning plus: ESL Design Flow learning plus: Image Scalar 4 ESL Flow Golden Model learning plus: Image Scalar 4 ESL Flow HW Algorithm Image Scalar 4 ESL Flow HW c++ code 底下我們用 SystemC 來模擬 Verilog 的寫法,並且證明其實 SystemC 除了有 high level 的 lib 外,它也可以是個很像 hardware 的 description. ps:心中os其實是想說當然是 SystemC 比 SystemVerilog 好摟...XD sample c code
     if(a>20){ a=a-1; }
else if(a<20){ a=a+1; }
module set && sensitive lists gen @ tt.h

#include >systemc.h>
#include >iostream>

SC_MODULE(SAMPLE){

sc_in>sc_uint>32> > a;
sc_out>sc_uint>32> > c;

sc_signal>sc_uint>1> > _idec_[2];
sc_signal>sc_uint>2> > _dec_;

SC_CTOR(SAMPLE){

SC_METHOD(_iproc_idec_0_);
dont_initialize();
sensitive >> a;

SC_METHOD(_iproc_idec_1_);
dont_initialize();
sensitive >> a;

SC_METHOD(_iproc_idec_);
dont_initialize();
sensitive >> _idec_[0];
sensitive >> _idec_[1];

SC_METHOD(_iproc_c_);
dont_initialize();
sensitive >> _dec_;


};

void _iproc_idec_0_();
void _iproc_idec_1_();
void _iproc_idec_();
void _iproc_c_();
};
@ tt.cpp methods
#include "tt.h"

// @verilog
// assign _idec_[0] = (a>20)? 1'b1 : 1'b0;
void SAMPLE::_iproc_idec_0_(){
   _idec_[0] = ( a.read() > 20 )? 1 : 0;
}

void SAMPLE::_iproc_idec_1_(){
   _idec_[1] = ( a.read() < 20 )? 1 : 0;
}

//@verilog
// _dec_ = {_idec_[1],_idec_[0]}; 
void SAMPLE::_iproc_idec_(){
   int _int_idec_1_ = int(_idec_[1].read());
   int _int_idec_0_ = int(_idec_[0].read());
   _dec_ = _int_idec_1_ << 1 | _int_idec_0_;
}

//@verilog
// mux && alu
void SAMPLE::_iproc_c_(){
   switch(int(_dec_.read())){
     case 1 : c = a.read() -1; break;
     case 2 : c = a.read() +1; break;
   }
}
hardware view learning plus: encoder 小技巧....

2010年12月19日 星期日

ILP Scheduling with DVFS constrain @ perl

在之前的 post 中 Force-Directed Scheduling with golden check @ perl 完成了 Syntax2SystemC gen 的部份, 雖然考慮到 cell 的 time delay 對 total avg power 的 optimal. 但這也只是 static power saving 的動作. 但其實在 timing path 上, 可以針對每個 processor element 加入不同的 frequency or power supply 來 inc/dec timing path 上的 delay, 讓 design 能夠更 meet 我們所需要的 constrains learning plus: DVFS emulator SmartReflex™ Power and Performance Management Technologies 底下就針對 DVFS 如何加入到 DFG graph 中的方式做探討. flow chart step1. 產生 DFG graph. step2. 產生 constrain DFG graph @ time/hardware constrains step3. find time critical path. 針對 critical path 上的 vertex 插入 DVFS vertex. 來 balance critical path. 且再最後 arch gen 的時候, 能夠 cluster 在同個 PE 底下. step3. 用 ILP + 標準差 的方式找出在 constrain 下的最佳 min sum(peak power) 的解. step4. explore SystemC
ILP flow
1. Nv is the total number of vertex operations in the sequencing DFG, excluding the
source and sink nodes (NO–OPs).
2. vi is any vertex in the DFG performing certain operations and 1 ≤ i ≤ Nv .
3. FUk is the functional unit of type k.
4. Mk is the maximum number of functional units of type FUk .
5. Costk is cost of functional unit of type FUk may be power, area or delay.
6. CS [i] is the ASAP time stamp for the operation vi .
7. CL [i] is the ALAP time stamp for the operation vi .
8. xi,c is the binary decision variable which is 1 if vertex vi starts in control step c,
else 0.
sample code case for 5 taps fir
#!/usr/bin/perl


use Data::Dumper;
use SysPerl::syntax2DFG;
use SysPerl::constrain2DFG;
use SysPerl::schedule::integer_linear_programming;
#use SysPerl::schedule::force_directed;
use SysPerl::arch::arch2DFG;
use strict;

#===================================
# @step 1 : gen simple DFG graph
# return  : DFG graph
#         : vertex_pre_stack 
#         : vertex_nxt_stack
#===================================
# 
#  y = b0*x0 + b1*x1 + b2*x2 + b3*x3 + b4*x4 + b5*x5;

my $tt = ['y','=','b0','*','x0','+',
                 ,'b1','*','x1','+',
                 ,'b2','*','x2','+',
                 ,'b3','*','x3','+',
                 ,'b4','*','x4','+',
                 ,'b5','*','x5',';'];

my $syn  = SysPerl::syntax2DFG->new();
   $syn->read_text($tt);
   $syn->run_text();
   $syn->free();

#   my $graph = $syn->get_deep_copy_graph();  
 
# remove tmp_reg && feedback assign 
# ex : d=c;
#      e=d+1;
    $syn->run_updt_DFG();

# get all graph && DFG flow 2 schedule 
my $DFG = $syn->get_deep_copy_DFG();

# dump graph as dot file    
  $syn->dump_DFG_graphviz_file('syn2DFG.dot');
  $syn->free();

#=====================================
# @step2. : 
# flow 1  : insert time weighted constrain 2 simple DFG graph and
#           gen Cstep graph(cycle step grpah)
# flow 2  : insert average power weighted constrain 2 Cstep and gen the force vale 
#           @ force directed scheduling
# return :  cycle graph
#=====================================
# set unit time wait delay
my $constrain_time_weighted_vertices = { 
     '+'  => 1,   # add delay 1 unit s
     '-'  => 1,   # sub delay 1 unit s
     '*'  => 1,   # mul
     '/'  => 1,   # div
     '%'  => 1,   # rem
     '>>' => 1,   # rsht
     '<<' => 1,   # lsht
};

#set unit average power consumed
my $constrain_power_weighted_vertices = {
     '+'  => 1.54,
     '-'  => 1,
     '*'  => 6.7,
     '/'  => 1,
     '%'  => 1,
     '>>' => 1,
     '<<' => 1,
};

my $con = SysPerl::constrain2DFG->new();
   $con->set_deep_DFG($DFG);

   $con->set_constrain_time_weighted($constrain_time_weighted_vertices);
   $con->set_constrain_power_weighted($constrain_power_weighted_vertices);

   $con->run_constrain_time_weighted_DFG();
   $con->run_constrain_NewDFG();
 
#   $con->dump_ALUDFG_graphviz_file('alu.dot');
   $con->dump_NewDFG_graphviz_file('con.dot');

#=============================
# schedule && cluster
#=============================
#my $sch = SysPerl::schedule::force_directed->new();
#   $sch->set_deep_cons2DFG($con);
#
#   $sch->run_forece_directed_scheduling();
#   $sch->report();

my $pe_number_constrain = {
     '+'  => 2,   # add numbers constrain for each time step
     '-'  => 2,   # sub 
     '*'  => 2,   # mul
     '/'  => 2,   # div
     '%'  => 2,   # rem
     '>>' => 2,   # rsht
     '<<' => 2,   # lsht
};

my $sch = SysPerl::schedule::integer_linear_programming->new();
   $sch->set_deep_cons2DFG($con);

   $sch->set_pe_number_constrain($pe_number_constrain);
   $sch->run_integer_linear_programming_scheduling();
   $sch->report();

#=============================
# explore hardware
#=============================  
my $arc = SysPerl::arch::arch2DFG->new();
   $arc->set_deep_sched2arch($sch);
   $arc->run_ALU_cluster();
   $arc->run_explore_SystemC();
project: https://github.com/funningboy/hg_lvl_syn/blob/master/main_ILP.pl Ref : 以多重電位操作之VLIW架構下運用ILP為基礎之低功率排程演算法

2010年12月17日 星期五

Force-Directed Scheduling with golden check @ perl

在之前的 post Force-Directed Scheduling with high level synthesis @ perl 中已經完成了 Force-Directed scheduling 核心部份.而這邊主要是 architecture gen 跟 SystemC map. 主要概念如下圖. architecture (CTL) 會根據每個 cycle step 來決定哪個 processor element 要 work. 等全部運算完後會把 DON 拉成 true 給 test bench. 且在下個 cycle 把算完的 data 讀入. 最後再跟 golden model 比對,做 function check.
# @ author: funningboy
# @ email : funningboy@gmail.com
#     syntax2SystemC  flow

use Data::Dumper;
use SysPerl::syntax2DFG;
use SysPerl::constrain2DFG;
use SysPerl::schedule2DFG;
use SysPerl::arch2DFG;
use strict;

#===================================
# @step 1 : gen simple DFG graph
# return  : DFG graph
#         : vertex_pre_stack
#         : vertex_nxt_stack
#===================================
#
#  c = (a+b)>>1;
#  d = w*(a-b);
#  e = d-c-g*c;

my $tt = ['c','=','(','a','+','b',')','>>','1',';'];
my $cc = ['d','=','w','*','(','a','-','b',')',';'];
my $gg = ['e','=','d','-','c','-','g','*','c',';'];
#my $gg = ['e', '=', 'e', '+', '1', ';'];

my $syn  = SysPerl::syntax2DFG->new();
  $syn->read_text($tt);
  $syn->run_text();
  $syn->free();

  $syn->read_text($cc);
  $syn->run_text();
  $syn->free();

  $syn->read_text($gg);
  $syn->run_text();
  $syn->free();

#   my $graph = $syn->get_deep_copy_graph(); 

# remove tmp_reg && feedback assign
# ex : d=c;
#      e=d+1;
   $syn->run_updt_DFG();

# get all graph && DFG flow 2 schedule
my $DFG = $syn->get_deep_copy_DFG();

# dump graph as dot file   
 $syn->dump_DFG_graphviz_file('syn2DFG.dot');
 $syn->free();

#=====================================
# @step2. :
# flow 1  : insert time weighted constrain 2 simple DFG graph and
#           gen Cstep graph(cycle step grpah)
# flow 2  : insert average power weighted constrain 2 Cstep and gen the force vale
#           @ force directed scheduling
# return :  cycle graph
#=====================================
# set unit time wait delay
my $constrain_time_weighted_vertices = {
    '+'  => 1,   # add delay 1 unit s
    '-'  => 1,   # sub delay 1 unit s
    '*'  => 2,   # mul
    '/'  => 2,   # div
    '%'  => 2,   # rem
    '>>' => 1,   # rsht
    '<<' => 1,   # lsht
};

#set unit average power consumed
my $constrain_power_weighted_vertices = {
    '+'  => 4,
    '-'  => 4,
    '*'  => 8,
    '/'  => 10,
    '%'  => 10,
    '>>' => 1,
    '<<' => 1,
};

my $con = SysPerl::constrain2DFG->new();
  $con->set_deep_DFG($DFG);

  $con->set_constrain_time_weighted($constrain_time_weighted_vertices);
  $con->set_constrain_power_weighted($constrain_power_weighted_vertices);

  $con->run_constrain_time_weighted_DFG();
  $con->run_constrain_NewDFG();

#   $con->dump_ALUDFG_graphviz_file('alu.dot');
  $con->dump_NewDFG_graphviz_file('con.dot');

#=============================
# schedule && cluster
#=============================
my $sch = SysPerl::schedule2DFG->new();
  $sch->set_deep_cons2DFG($con);

  $sch->run_forece_directed_scheduling();
  $sch->report();

#=============================
# explore hardware
#============================= 
my $arc = SysPerl::arch2DFG->new();
  $arc->set_deep_sched2arch($sch);
  $arc->run_ALU_cluster();
  $arc->run_explore_SystemC();
Q: SystemC compile? http://funningboy.blogspot.com/2010/09/risc-cpu-systemc.html project https://github.com/funningboy/SOC_c_model/blob/master/Algorithm/Force_Directed_Scheduling/main.pl ps: future works 1.pipeline or parallel detected. 2.bus interface. 3.memory map location 4.processor element sharing

2010年12月14日 星期二

perl delete array

在 array function 上. 除了用 push, pop 來改變 array 的 pointer外. 也可以用 delete 來清除 array 內部的 data, 但這樣會遇到一個問題. 就是 delete 完後 array 不會 shit 到最新的位置, 所以在做查找時候會出現 undef. 範例. 假設我們 delete 了 dd[0] 的 data. 而下次我們要取得 dd[0] 時,應該為 'c' 才對. 但是現在卻是 undef...xd
use Data::Dumper;
use strict;

my $dd = ['d',
          'c',
          'x',
          'e'];

delete $dd->[0];

print Dumper($dd);                   
result
$VAR1 = [
          undef,
          'c',
          'x',
          'e'
        ];
solution 建立個新的 array 之後再 map 過去...XD.感覺是個很笨的方式.. 不然也可以用 "splice" / "join + split" ...
use Data::Dumper;
use strict;

my $dd = ['d',
          'c',
          'x',
          'e'];

delete $dd->[0];

my $a=[];

foreach my $k (@{$dd}){
  if($k){
   push(@{$a},$k);
 }
}

@{$dd} = @{$a};

print Dumper($dd);
ex: join + split
use Data::Dumper;
use strict;

my $dd = ['d',
          'c',
          'x',
          'e'];

delete $dd->[0];

my $st = join(' ',@{$dd});
@{$dd} = split(' ',$st);

print Dumper($dd);

perl pointer ....

最近被 perl 的 pointer 搞得昏天暗地. 原本以為 $a = $b; pointer 可以指到一個新的記憶體位置並把 $b 的內容 copy 過去. 但是發現卻不是如此... 而是 $a = $b 共用同個記憶體位置. 所以改變 $a 的同時 $b 也會改變. 底下的範例 $arr_e = $arr_b 的 內容, 但是我們改變 $arr_e 的時候,也同時改變了 $arr_b 的記憶體位置.
my $arr_b = ['d','c','x','e'];

my $arr_e = $arr_b;

push (@{$arr_e},'s');

print Dumper($arr_b);
result
$VAR1 = [
         'd',
         'c',
         'x',
         'e',
         's'
       ];

除非我們用 value copy 的方式. 把 $arr_b pointer 內的 data 根據 type 建立起一個相同 type 的 buffer 來儲存. 這樣 pointer 會產生個新的 pointer 位置.
my $arr_b = ['d','c','x','e'];

my @arr_e = @{$arr_b};

push (@arr_e,'s');

print Dumper($arr_b);
result
$VAR1 = [
         'd',
         'c',
         'x',
         'e',
       ];

2010年12月6日 星期一

Force-Directed Scheduling with high level synthesis @ perl

早在之前的 post 中 ASAP ALAP scheduling @perl ,DFG @ perl,DFG @ scheduling Algorithm..提過 DFG 在 high level synthesis 的重要性, 底下就寫個範例來實現從 syntax 到 hardware architecture explore 的 synthesis flow. step1. syntax parser 2 DFG 目前只提供 directed graph 方式, no cycle graph.... ex: e=e+1 為 cycle graph...
#  c = (a+b)>>1;
#  d = w*(a-b);
#  e = d-c-g*c;

my $tt = ['c','=','(','a','+','b',')','>>','1',';'];
my $cc = ['d','=','w','*','(','a','-','b',')',';'];
my $gg = ['e','=','d','-','c','-','g','*','c',';'];
#my $gg = ['e', '=', 'e', '+', '1', ';'];

my $syn  = SysPerl::syntax2DFG->new();
 $syn->read_text($tt);
 $syn->run_text();
 $syn->free();

 $syn->read_text($cc);
 $syn->run_text();
 $syn->free();

 $syn->read_text($gg);
 $syn->run_text();
 $syn->free();

產生sample graph result: step2. add time constrain 針對不同的 OP 加入 time info
my $constrain_time_weighted_vertices = {
    '+'  => 1,   # add delay 1 unit s
    '-'  => 1,   # sub delay 1 unit s
    '*'  => 5,   # mul
    '/'  => 8,   # div
    '%'  => 8,   # rem
    '>>' => 1,   # rsht
    '<<' => 1,   # lsht
};
如果超出 time constrain 就會建立出 w::@, r::@ 的 Vertex. 分別代表 w::@ 寫到內部的 register 跟 r::@ 讀取內部的 register, 且彼此差一個 clock cycle. ex: >>::0 表示 op(>>),id(0) result step3. Cstep (cycle step) gen. 建立起每個 cycle step 的 info. step4. add power constrain
#set unit average power consumed
my $constrain_power_weighted_vertices = {
    '+'  => 4,
    '-'  => 4,
    '*'  => 8,
    '/'  => 10,
    '%'  => 10,
    '>>' => 0,
    '<<' => 0,
};

my $con = SysPerl::constrain2DFG->new();
  $con->set_deep_DFG($DFG);

  $con->set_constrain_time_weighted($constrain_time_weighted_vertices);
  $con->set_constrain_power_weighted($constrain_power_weighted_vertices);

  $con->run_constrain_time_weighted_DFG();
  $con->run_constrain_NewDFG();

#   $con->dump_ALUDFG_graphviz_file('alu.dot');
  $con->dump_NewDFG_graphviz_file('con.dot');

step5. run Force-Directed Scheduling && report
my $sch = SysPerl::schedule2DFG->new();
  $sch->set_deep_cons2DFG($con);

  $sch->run_forece_directed_scheduling();
  $sch->report();
results
$VAR1 = {
         'ALU' => {
                    '-::1' => {
                                'begin' => 2,
                                'end' => 2
                              },
                    '+::0' => {
                                'begin' => 1,
                                'end' => 1
                              },
                    '>>::0' => {
                                 'begin' => 1,
                                 'end' => 1
                               },

power for each cycle step
$VAR1 = {
         '1' => 17,
         '2' => 16
       };
future works... 1. cluster register reduce the feedback registers to store the tmp value 2. cluster op 2 ALU block, such as
//hardware ALU block
void iALU_block_1(int a,int n, int *c){
    c = a + b;
}
...


//@ cycle domain
// cycle 1
iALU_block_1(a,b,&c);
iALU_block_2(a,b,&c);

//cycle 2...
3. architecture explore from DFG... c/verilog ... project: https://github.com/funningboy/SOC_c_model/blob/master/Algorithm/Force_Directed_Scheduling/main.pl refs: Force Directed Scheduling for Behavioral Synthesis Modified Force-Directed Scheduling for Peak and Average Power Parallel Algorithms for Force Directed Scheduling of Flattened and [PDF] Scheduling ebook http://ishare.iask.sina.com.cn/f/10389743.html

2010年12月3日 星期五

ASAP ALAP scheduling @perl

利用 ASAP, ALAP 做 scheduling 找出每個 Vertex 的 tolerance range. 我們就可以利用這樣的 info 來減少 peak power 的產生. 範例如下圖所示 constrain 1. 在不影響每個 Vertex 的先後順序下, 用 ASAP 算出 begin 值. 用 ALAP 算出 end 值. 2. 當 begin==end 時,為 critical path, 表示這個 vertex 不能被移動, begin!=end 表示為 vertex 可以移動的位置. example : vertex 1 : begin time 2 end time 4 vertex 2 : begin time 3 end time 3 vertex 3 : begin time 2 end time 4 由此可知, vertex 2 的位置為固定值, 只能靠移動 Vertex1, Vertex3 來減少 peak power. worst case best case ps: 這可以用 short path algorithm 來算出最小值 example : power_vertex1(x) 2<=x<=4; power_vertex2(y) 3<=y<=3; power_vertex3(z) 2<=z<=4; find min( power_vertex1(x) + power_vertex2(y) + power_vertex3(z) )
   my $tt = CDFG->new();
   
   $tt->set_time_weighted_vertex('#::1',0);
   ...
   $tt->set_time_weighted_edge('<::0','#::1',1);
   ...

   # initial set
   $tt->run_time_weighted_ini();

   # ASAP flow
   $tt->run_time_weighted_test();
   $tt->run_time_weighted_ASAP();
   $tt->rst_time_weighted_ASAP();
   # dump or not
   $tt->dump_ASAP_graphviz_file('ASAP.dot');

   # ALAP flow
   $tt->run_time_weighted_test();
   $tt->run_time_weighted_max_path();
   $tt->run_time_weighted_ALAP();
   $tt->rst_time_weighted_ALAP();
   # dump or not
   $tt->dump_ALAP_grpahviz_file('ALAP.dot');

   # report
my $rpt = $tt->run_report();
   print Dumper($rpt);


project: https://github.com/funningboy/SOC_c_model/blob/master/ASALAP/main.pl