Tuesday, September 27, 2011

PyPy toolchain case study



-input program (Python)
-output (C#, Java)
-flow analysis
flow graph gen (block, start/exit/exception, condition)
block (operators, values, result -> pointer map)
block (list, input, exit, def, use, alive)
-annotator
class def (SomeObject, SomeInteger, SomeString...)
class def (SomeList, SomeTuple, SomeDict...) for high level
analysis pass (map flow graph (high level) to low-level graph)
-RTyper
Rtype def (int, float, char...)
low-level optimization (malloc removal, function inlining, stackless)
-lltype
lltype def (LLVM based)
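
The def/use/alive bookkeeping listed under flow analysis can be illustrated with a generic backward liveness pass over a small flow graph. This is only a sketch of the textbook technique, not PyPy's own code: the `Block` class, its field names, and the example graph are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    """Hypothetical flow-graph block (not a PyPy class)."""
    name: str
    defs: set                 # variables assigned in the block
    uses: set                 # variables read before any assignment in the block
    exits: list = field(default_factory=list)  # names of successor blocks


def liveness(blocks):
    """Iterate to a fixed point: in = uses | (out - defs), out = union of successors' in."""
    live_in = {b.name: set() for b in blocks}
    live_out = {b.name: set() for b in blocks}
    changed = True
    while changed:
        changed = False
        for b in reversed(blocks):
            out = set().union(*(live_in[s] for s in b.exits))
            inn = b.uses | (out - b.defs)
            if out != live_out[b.name] or inn != live_in[b.name]:
                live_out[b.name], live_in[b.name] = out, inn
                changed = True
    return live_in, live_out


# Example graph for: x = 0; while x < n: x = x + 1; return x
graph = [
    Block("start", defs={"x"}, uses=set(), exits=["cond"]),
    Block("cond", defs=set(), uses={"x", "n"}, exits=["body", "exit"]),
    Block("body", defs={"x"}, uses={"x"}, exits=["cond"]),
    Block("exit", defs=set(), uses={"x"}, exits=[]),
]
live_in, live_out = liveness(graph)
for b in graph:
    print(b.name, sorted(live_in[b.name]), sorted(live_out[b.name]))
```

In this example `n` is alive on entry to every block except `exit`, because the loop condition keeps reading it.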






Tuesday, September 20, 2011

Monday, September 19, 2011

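Handel-C

Handel-C is a high-level hardware language. Its syntax is a bit like writing the hardware drivers as an API, so that users can control the hardware's behavior directly through those API calls. The drawback is that the code is tied to one platform, and how well the system can be optimized depends on the form of the architecture.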
-hardware keyword(token)
type()   : unsigned, signed,
length() : int <3>, [3:0]
size()   : [10]

unsigned 4 RAM[4][4]; // 4x4 array of 4-bit entries

 
-logic/arithmetic
&,/,^,+,-...


-interface definition


-module keyword  
RAM,ROM,BUS,...


-method/processor
par : parallel
seq : sequential

par 
{
    { 
        a = b; 
        c = d;
        link ! x;
    }

    link ? y;
}

/*
cycle  Branch_1    Branch_2
1      a=b;        delay
2      c=d;        delay
3      channel_out channel_in
*/
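
The cycle table above can be reproduced with a small simulation. This is a simplified sketch, not Handel-C semantics in full: each branch is a Python generator that yields once per clock cycle, the channel is a plain dict, and the writer is assumed to be scheduled before the reader within a cycle. The names `branch_1`, `branch_2`, `run_par`, and the initial register values are all made up for illustration.

```python
def branch_1(env, chan):
    """a = b; c = d; link ! x;  (one statement per clock cycle)"""
    env["a"] = env["b"]
    yield "a=b"
    env["c"] = env["d"]
    yield "c=d"
    chan["value"] = env["x"]   # offer x on the channel
    chan["ready"] = True
    yield "channel_out"


def branch_2(env, chan):
    """link ? y;  (waits until the writer is ready)"""
    while not chan["ready"]:
        yield "delay"
    env["y"] = chan["value"]
    yield "channel_in"


def run_par(*branches):
    """Step every branch once per cycle until all branches have finished."""
    trace = []
    while True:
        row = tuple(next(b, None) for b in branches)
        if all(step is None for step in row):
            return trace
        trace.append(row)


env = {"a": 0, "b": 1, "c": 0, "d": 2, "x": 9, "y": 0}
chan = {"ready": False, "value": None}
trace = run_par(branch_1(env, chan), branch_2(env, chan))
for cycle, row in enumerate(trace, start=1):
    print(cycle, *row)
```

The trace has three rows, matching the table: the reading branch idles for two cycles and the transfer happens in the third.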

-special keywords
<- Take LSBs
\\ Drop LSBs
@  Concatenation

x = 0xC7; 
y = x <- 4; // y = 0x7
z = x \\ 4; // z = 0xC
x = y @ z;  // x = 0x7C
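
For readers without a Handel-C toolchain, the three operators can be mimicked with ordinary shifts and masks. This is a rough Python equivalent, assuming unsigned values; the helper names are invented, and because Python integers carry no bit width, the width of the low operand has to be passed to `concat` explicitly.

```python
def take_lsbs(value, n):
    # Handel-C: value <- n  (keep the n least significant bits)
    return value & ((1 << n) - 1)


def drop_lsbs(value, n):
    # Handel-C: value \\ n  (discard the n least significant bits)
    return value >> n


def concat(high, low, low_width):
    # Handel-C: high @ low  (high supplies the upper bits)
    return (high << low_width) | take_lsbs(low, low_width)


x = 0xC7
y = take_lsbs(x, 4)   # 0x7
z = drop_lsbs(x, 4)   # 0xc
x = concat(y, z, 4)   # 0x7c
print(hex(y), hex(z), hex(x))
```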


-standard library
static extern
volatile

extern "C" int printf(const char *format, ...);


-try except(reset)
try ... reset

try {
 a = 1;
}
reset(_condition)


-RAM bank/block check
x = y>z ? RamA[1] : RamA[2]; // fail: RamA is accessed twice in one clock cycle
x = RamA[y>z ? 1 : 2];  // pass: RamA is accessed once


-Functions and macros:
functions : not recursive
macros    : recursive

macro expr multiply(x,y) = select(width(x) == 0, 0,
    multiply(x\\1,y<<1) +
    (x[0] == 1 ? y : 0));

a = multiply(b,c);

a = ((b\\3)[0] == 1 ? c<<3 : 0) +
    ((b\\2)[0] == 1 ? c<<2 : 0) +
    ((b\\1)[0] == 1 ? c<<1 : 0) +
    (b[0]      == 1 ? c    : 0);

a = ((b&8) == 8 ? c*8 : 0) +
    ((b&4) == 4 ? c*4 : 0) +
    ((b&2) == 2 ? c*2 : 0) +
    ((b&1) == 1 ? c   : 0);
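
The recursive macro and its 4-bit expansion above are both shift-and-add multiplication. The sketch below checks that in Python, under the assumption of unsigned operands. Python integers have no `width()`, so the width is passed as an explicit argument, and the function names are illustrative only.

```python
def multiply(x, y, width):
    """Mirror of the recursive macro: peel off one bit of x per level."""
    if width == 0:                      # select(width(x) == 0, 0, ...)
        return 0
    return multiply(x >> 1, y << 1, width - 1) + (y if x & 1 else 0)


def multiply_unrolled4(b, c):
    """Mirror of the expanded form for a 4-bit b."""
    return ((c << 3 if (b >> 3) & 1 else 0) +
            (c << 2 if (b >> 2) & 1 else 0) +
            (c << 1 if (b >> 1) & 1 else 0) +
            (c if b & 1 else 0))


# Exhaustive check over all 4-bit operand pairs.
ok = all(multiply(b, c, 4) == multiply_unrolled4(b, c) == b * c
         for b in range(16) for c in range(16))
print(ok)
```

In hardware the recursion is unrolled at compile time, which is why it is allowed for macros but not for functions.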

delay; // does nothing for one clock cycle


-Platform type
set family = XilinxVirtex;
macro expr DoThis() =
select (__isfamily(XilinxVirtex),  DoThing1(),
select (__isfamily(AlteraApex20K), DoThing2(),
select (__isfamily(MadeUpDevice),  DoThing3(), DoThing4())
)
);


-Clock timing
ram unsigned 8 ExtRAM[16384] with { 
offchip = 1, westart = 2, welength = 1, 

data = {"P1", "P2"}, 
addr = {"P9", "P10"}, 
we = {"P23"}, 
oe = {"P24"}, 
cs = {"P25"}
};

-interfaces between different clock domains
FIFO/PORT/BUS
Ref: http://babbage.cs.qc.edu/courses/cs345/Manuals/HandelC.pdf

Thursday, September 15, 2011

hyper block





Spark
Parallel synthesis
http://mesl.ucsd.edu/spark/download.shtml

xPilot
Parallel synthesis
xPilot: A Platform-Based Behavioral Synthesis System

hyper Block

Mitrion SDK PE

The Mitrion SDK PE is a complete development environment for Mitrion-C applications. It includes the Mitrion-C compiler, a graphical simulator, documentation and examples. You can use the Mitrion SDK PE to develop and simulate applications for the Mitrion Virtual Processor without having access to any FPGA hardware.

impulse-c
http://www.impulseaccelerated.com/
http://www.impulseaccelerated.com/Tutorials/Basic/Tutorial_Basic_HW_Gen.pdf

Handel-C
http://en.wikipedia.org/wiki/Handel-C
SRC Carte
rctoolbox

cpsr (the current program status register)
spsr (saved program status register)
http://www.eetimes.com/discussion/cole-bin/4217014/Digging-for-gold-at-the-ESC
 



Friday, September 2, 2011

Jacquard @ Eclipse synthesis tool


features
1. IP lib support (Add, Sub ...for xilinx)

2. automatic insert (double click in GUI)

3. optimization

3.1 high level

loop unrolling, fusion, inlining, reduce, divider/multiplier elimination, array...
Ref: http://www.jacquardcomputing.com/roccc/tutorials/4-optimizations/high-level-optimizatons/

ex:
constraint @ if (*) latency > 1 unit cycle.
method     @ operator bind (<<, (resource)+, (resource)-)
ps : a*5 = (a*4) + a = (a<<2) + a
     a*7 = (a*8) - a = (a<<3) - a

or

(32bits)a = (32bits)b * (32bits)c = (16bits)b * (16bits)c
method    @ operator width reduce (32->16)
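
Both rewrites above can be checked numerically. The sketch below is plain Python and is not taken from the ROCCC tool itself; the function names are hypothetical. It replaces a constant multiply with shifts plus one add or subtract, and tests the precondition for the width reduction, namely that both operands fit in 16 unsigned bits.

```python
def times5(a):
    return (a << 2) + a          # a*5 = a*4 + a


def times7(a):
    return (a << 3) - a          # a*7 = a*8 - a


def mul_by_constant(a, k):
    """General shift-and-add: one shifted copy of a per set bit of k (k >= 0)."""
    total, shift = 0, 0
    while k:
        if k & 1:
            total += a << shift
        k >>= 1
        shift += 1
    return total


def can_narrow_to_16(b, c):
    """True when a 16x16 multiplier is enough for operands declared as 32 bits."""
    return 0 <= b < (1 << 16) and 0 <= c < (1 << 16)


print(times5(6), times7(6), mul_by_constant(6, 5), can_narrow_to_16(300, 70000))
```

When the multiplier is slower than one cycle, the shift forms trade it for an adder or subtractor, which is the operator binding the note refers to.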



3.2 low level
operator balance, copy reduction, fanout tree gen ...
ref: http://www.jacquardcomputing.com/roccc/tutorials/4-optimizations/low-level-optimizations/

ex:
c = sum(a,b);
...
e = sum(c,e);

@ sum is called 2 times,
@ method: resource binding & scheduling


4. external IP wrapper
http://www.jacquardcomputing.com/roccc/tutorials/5-advanced-usage/intrinsic-usage-and-management/

5. testbench gen

6. protocol interface support