Thursday, September 30, 2010

risc cpu @ systemc

Hi all. If you are interested in CPU architecture design, I think the 'risc cpu' is a good place to start. It ships as a sample in the SystemC examples package, but it has some multi-driver problems. A multi-driver problem means that several outputs drive the same signal. The value at that point is unstable: it could be 1, 0, or X, so the simulator cannot tell which value the signal holds. SystemC has no wire definition; an 'sc_signal' behaves like a register or buffer and keeps its current value until the next trigger changes it. We avoid the problem in a very simple way: create a new module and change the 'sc_signal' declarations to 'sc_in' and 'sc_out'. All new packages and versions are released here. Please use this command to compile our package:
g++ *.cpp -I/usr/systemc/include -L/usr/systemc/lib-linux -o cpu -lsystemc
1. Fix list

1.1 Multi-driver example

Error: (E115) sc_signal cannot have more than one driver:
 signal `STALL_FETCH' (sc_signal)
 first driver `PAGING_BLOCK.port_13' (sc_out)
 second driver `BIOS_BLOCK.port_6' (sc_out)
In file: ..\..\src\sysc\communication\sc_signal.cpp:126

1.2 Data cache miss

The dcache environment is unconnected. We replaced it with the mem architecture, so the "lw" and "sw" instructions work.

1.3 Branch and jump conditions did not work

We fixed the program counter so that jump and branch conditions are taken.

2. How to use
//1 use this command to assemble our asm code into the icache file
perl assembler.pl test2.asm -code > icache

//2 use this command to get visible asm code
perl assembler.pl test2.asm  > view

//3 run and check
./cpu
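
To make the multi-driver problem from the top of this post concrete, here is a minimal sketch in plain C of how a resolved net combines its drivers. The names logic_t, resolve2 and resolve are made up for illustration and are not part of SystemC. A plain sc_signal does not resolve anything; it reports E115 as shown in the fix list, which is why the package was restructured around sc_in/sc_out.

```c
#include <assert.h>
#include <stddef.h>

/* Four-valued logic: 0, 1, unknown (X), and undriven/high-impedance (Z). */
typedef enum { LOGIC_0, LOGIC_1, LOGIC_X, LOGIC_Z } logic_t;

/* Resolve two drivers on one net (hypothetical helper). */
logic_t resolve2(logic_t a, logic_t b)
{
    if (a == LOGIC_Z) return b;      /* an undriven side yields to the other */
    if (b == LOGIC_Z) return a;
    return (a == b) ? a : LOGIC_X;   /* two drivers that disagree give X */
}

/* Resolve n drivers; with no active driver the net stays Z. */
logic_t resolve(const logic_t *drivers, size_t n)
{
    logic_t out = LOGIC_Z;
    assert(drivers != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        out = resolve2(out, drivers[i]);
    return out;
}
```

Two blocks driving 1 and 0 onto the same net resolve to X, which is exactly the "1 or 0 or x" uncertainty described above; a single active driver next to an idle (Z) one resolves cleanly.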

Wednesday, September 29, 2010

ALU estimator @ RTL verilog

Purpose: in RTL design, beyond functional verification, knowing the number of ALUs in advance helps somewhat with synthesis and area estimation. These rough figures are available at RTL level and can feed a quick architecture analysis and optimization.

Method:
step 1. analyze the time windows in each always block.
step 2. find the maximum number of ALUs of each type we define (Add, Sub, Mul, ...).

Target: find the number of ALUs the design needs.

Sample code (ps: carry and overflow checks are not considered here)
  always @(posedge clk or posedge rst )begin
       if( rst )begin
            out_a <= 0;
            tmp_a <= 0;
       end
       else  begin
            if( sel_1 == 1'b1 ) begin
                out_a <= in_a + in_b + 2;
            end
            else begin
                if( sel_2 ==1'b1 )begin
                    out_a <= in_a - in_b + tmp_a;
                    tmp_a <= in_a>>1;
                end
                else begin
                    out_a <= in_a>>1;
                end
            end
       end
  end

  always @(posedge clk or posedge rst )begin
     if( rst )begin
          out_b <= 0;
     end
     else begin
          if( out_b < 10 ) begin
              out_b <= out_b +1;
          end
     end
  end
Results: always block 1 needs at least 2 ADDs (its branches are mutually exclusive, so only the largest branch counts), and always block 2 needs at least 1 ADD. Because the always blocks are independent of each other and run concurrently, they cannot share an adder, so this design needs at least 2+1 ADDs. The other operators follow the same reasoning. Finally, you can run a single sample ADD through PrimeTime, build its info into a table, and plug that into your design to get a rough idea of the design's area.

ADD -> @ always block -> block level -> counts
$VAR1 = { '1' => { '4' => 1, '3' => 2 }, '2' => { '3' => 1 } };
SUB -> @ always block -> block level -> counts
$VAR1 = { '1' => { '4' => 1 }, '2' => {} };
MUL -> @ always block -> block level -> counts
$VAR1 = { '1' => {}, '2' => {} };

Tool requirements: perl verilog package
Code download here.
Refs: ESL Design Flow, peak power, NetWork on Chip @c
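
The counting rule used in this post (maximum over the exclusive branches of one always block, sum over the blocks) can be sketched in a few lines of C. This is only an illustration of the arithmetic, not the Perl tool itself; always_block_t, block_adders and design_adders are hypothetical names, and the per-branch ADD counts are assumed to have been extracted already.

```c
#include <assert.h>
#include <stddef.h>

/* One always block: the ADD count found in each mutually exclusive branch. */
typedef struct {
    const int *adds_per_branch;
    size_t     n_branches;
} always_block_t;

/* Only one branch is active per clock, so a block needs as many
 * adders as its most demanding branch. */
int block_adders(const always_block_t *b)
{
    int max = 0;
    assert(b != NULL);
    for (size_t i = 0; i < b->n_branches; i++)
        if (b->adds_per_branch[i] > max)
            max = b->adds_per_branch[i];
    return max;
}

/* Always blocks run concurrently, so their requirements add up. */
int design_adders(const always_block_t *blocks, size_t n)
{
    int total = 0;
    for (size_t i = 0; i < n; i++)
        total += block_adders(&blocks[i]);
    return total;
}
```

For the sample code above, block 1 has branches with 2, 1 and 0 ADDs and block 2 has branches with 1 and 0 ADDs, which gives 2 + 1 = 3.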

Monday, September 27, 2010

A small encoder trick

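Do piles of conditionals ever make your head spin? When coding Verilog you also have to check whether the conditions inside an always block will infer a latch, and at the end, at codelink time, you worry that coverage is insufficient. The simple idea is to expand all the conditions into an encoder table that stores hash key -> value pairs. That shows whether any condition is missing and makes the code easier to read.

original part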
          if( in_a == 1 ){
              out = in+1;
          } else {
              if( in_b == 0 ){
                  out = in-1;
              } else {
                  if( in_c == 1 ){
                      out = in*2;
                  } else {
                      out = in;
                  }
              }
          }
encoder part
//range 0 ~ 7 
s = in_c << 2 | in_b << 1 | in_a;

switch(s){
  case  0 : out = in-1; break; // in_c(0), in_b(0), in_a(0)
  case  1 : out = in+1; break; // in_c(0), in_b(0), in_a(1)
  case  2 : out = in;   break; // in_c(0), in_b(1), in_a(0)
  case  3 : out = in+1; break; // in_c(0), in_b(1), in_a(1)
  case  4 : out = in-1; break; // in_c(1), in_b(0), in_a(0)
  case  5 : out = in+1; break; // in_c(1), in_b(0), in_a(1)
  case  6 : out = in*2; break; // in_c(1), in_b(1), in_a(0)
  case  7 : out = in+1; break; // in_c(1), in_b(1), in_a(1)
 }
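
One practical benefit of the table form is that it can be checked exhaustively against the nested form. The sketch below wraps both versions from this post in functions (out_nested, out_table and forms_agree are names chosen here for illustration) and walks all 8 select combinations, assuming in_a, in_b and in_c are single bits.

```c
#include <assert.h>

/* Nested form from the "original part". */
int out_nested(int in, int in_a, int in_b, int in_c)
{
    if (in_a == 1) return in + 1;
    if (in_b == 0) return in - 1;
    if (in_c == 1) return in * 2;
    return in;
}

/* Table form from the "encoder part". */
int out_table(int in, int in_a, int in_b, int in_c)
{
    int s = in_c << 2 | in_b << 1 | in_a;   /* range 0..7 */
    switch (s) {
    case 0: return in - 1;
    case 1: return in + 1;
    case 2: return in;
    case 3: return in + 1;
    case 4: return in - 1;
    case 5: return in + 1;
    case 6: return in * 2;
    case 7: return in + 1;
    default: assert(0); return in;          /* inputs were not single bits */
    }
}

/* Returns 1 if both forms agree for every select combination. */
int forms_agree(int in)
{
    for (int s = 0; s < 8; s++) {
        int a = s & 1, b = (s >> 1) & 1, c = (s >> 2) & 1;
        if (out_nested(in, a, b, c) != out_table(in, a, b, c))
            return 0;
    }
    return 1;
}
```

If a case were missing or wrong in the table, forms_agree would return 0, which is the "find the missing condition" benefit in executable form.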

OpenEmbedded case study

BitBake is the core of OpenEmbedded. It mainly solves cross-compile and configure problems. On an embedded system, just building the environment takes a while, and porting concerns make it even worse. BitBake covers fetching (svn, git, svk) + compile + configure, so version sync plus a local build gives quick customization. See the links below for a more detailed explanation.

pic ref: OpenEmbedded and BitBake

Refs:
Welcome to OpenEmbedded
轉換OpenEmbedded的repository為Subversion系統
SVK與嵌入式系統開發
OpenEmbedded First Try
SVK 使用雜記與隨想
OpenEmbedded and BitBake

Sunday, September 26, 2010

Curt @Jserv case study

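I finally had the chance to study "Curt", a work by the master Jserv, a well-known free-software author on the net. It is a small OS: small, but complete, providing sp, stat, and help threads plus a scheduling mechanism. The system flow splits into two parts. Step 1 is the boot loader, low-level assembly code that loads the image into RAM. Step 2 starts the OS and handles scheduling. For the detailed flow, see the "Embedded Systems" course material from the Department of Computer Science and Information Engineering, National Taiwan Normal University. Below I have only arranged what I know into a flow list; the Curt download also contains detailed explanations.

Step 1.
* initial set
* flush TLB/ICache/DCache
* interrupt disable
* clock set
* UART hardware to console display set
* all enable, then jump to RAM main() @0xc0000000
A similar flow can also be seen in U-Boot.

Step 2.
* OS start
* initialize the ready_list for each thread priority, delayed_list, termination_wait_list
* set threads into the thread table, set priority/function
* run threads and detect scheduling (SCHED_TIME_EXPIRE||SCHED_THREAD_REQUEST)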
        // save the CPU status and disable interrupts
        cpu_sr = save_cpu_sr();

        // get the Id of the highest-priority thread
        top_prio = get_top_prio(); 

        /* If the timer expires... */
        if (sched_type == SCHED_TIME_EXPIRE) {
                /* Threads that are currently running will continue to run it
                 * at the highest priority */

                if (current_thread->prio < top_prio) {
                        current_thread->time_quantum = TIME_QUANTUM;
                        restore_cpu_sr(cpu_sr);
                        return;
                }
                /* otherwise, threads in a ready state, then run the highest
                 * priority one. */

                // take the first node off ready_list, convert it to its thread entry, and prepare the context switch
                pnode = delete_front_list(&ready_list[top_prio]);
                if (is_empty_list(&ready_list[top_prio]))
                        prio_exist_flag[top_prio] = false;
                next_thread = entry_list(pnode, thread_struct, node);
                next_thread->state = RUNNING;
                next_thread->time_quantum = TIME_QUANTUM;

                /* Ready to change the status of the currently executing thread */
                current_thread->state = READY;
                insert_back_list(&ready_list[current_thread->prio],
                                 &current_thread->node);
                prio_exist_flag[current_thread->prio] = true;
                total_csw_cnt++;
                /* actual context switching */
                context_switch_in_interrupt();
        }
code ref: Jserv's 解析 CuRT 與嵌入式系統設計. Enlarged view here. ps: corrections are welcome if anything here is wrong, thanks.

Friday, September 24, 2010

x.org case study

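dinotrace @ waveform viewer touched on how a GUI interface is implemented. I originally assumed it was only widgets (GUI layout) + events (process triggers), but x.org (X11) on a terminal actually behaves more like a server/client model. See the explanations below. Different trades really are worlds apart, and it was hard going to read, so just treat it as a storybook.

Xorg 嶄新的硬體加速與效能提昇機制
Xorg 嶄新的硬體加速與效能提昇機制(续)

flow chart: see the glossary.

Refs: The Xlib Manual, Xlib API, wm-spec, http://www.opengl.org/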

Monday, September 20, 2010

Mouse emulator @ Linux

Below is a demo of the In-Air Mouse and Joystick, which uses a remote control to emulate a real mouse. I went looking for source code to study and found that there is even a project called Keymouse. Google really is amazing. 1. First use modprobe to load the device under /dev/input:
% sudo modprobe uinput
2. sample code (ref here). Once uinput is loaded, use ioctl() to configure the device I/O. Finally, use input_event from linux/input.h to emulate our input vectors.
#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <linux/input.h>
#include <linux/uinput.h>
#include <sys/time.h>
#include <unistd.h>
#include <sys/ioctl.h>

// Test the uinput module

struct uinput_user_dev   uinp;
struct input_event       event;

int main(void) {

            int ufile, retcode, i;

            ufile = open("/dev/input/uinput", O_WRONLY | O_NDELAY );
            printf("open /dev/input/uinput returned %d.\n", ufile);
            if (ufile < 0) {   /* open() returns -1 on failure, not 0 */
                        printf("Could not open uinput.\n");
                        return -1;
            }

            memset(&uinp, 0, sizeof(uinp));
            strncpy(uinp.name, "simulated mouse", 20);
            uinp.id.version = 4;
            uinp.id.bustype = BUS_USB;

            ioctl(ufile, UI_SET_EVBIT, EV_KEY);
            ioctl(ufile, UI_SET_EVBIT, EV_REL);
            ioctl(ufile, UI_SET_RELBIT, REL_X);
            ioctl(ufile, UI_SET_RELBIT, REL_Y);

            for (i=0; i<256; i++) {
                        ioctl(ufile, UI_SET_KEYBIT, i);
            }

            ioctl(ufile, UI_SET_KEYBIT, BTN_MOUSE);

            // create input device in input subsystem
            retcode = write(ufile, &uinp, sizeof(uinp));
            printf("First write returned %d.\n", retcode);

            retcode = (ioctl(ufile, UI_DEV_CREATE));
            printf("ioctl UI_DEV_CREATE returned %d.\n", retcode);
            if (retcode) {
                        printf("Error create uinput device %d.\n", retcode);
                        return -1;
            }

            // NOW DO STUFF !!!!

            for (i=0; i<100; i++) {

                        struct timeval tv1;

                        // move pointer upleft by 5 pixels
                        memset(&event, 0, sizeof(event));
                        gettimeofday(&event.time, NULL);
                        event.type = EV_REL;
                        event.code = REL_X;
                        event.value = -5;
                        write(ufile, &event, sizeof(event));
            
                        memset(&event, 0, sizeof(event));
                        gettimeofday(&event.time, NULL);
                        event.type = EV_REL;
                        event.code = REL_Y;
                        event.value = -5;
                        write(ufile, &event, sizeof(event));
            
                        memset(&event, 0, sizeof(event));
                        gettimeofday(&event.time, NULL);
                        event.type = EV_SYN;
                        event.code = SYN_REPORT;
                        event.value = 0;
                        write(ufile, &event, sizeof(event));

                        // wait just a moment
                        do { gettimeofday(&tv1, NULL); } while ((tv1.tv_usec & 0x3FFF) != 0);
                        do { gettimeofday(&tv1, NULL); } while ((tv1.tv_usec & 0x3FFF) == 0);
            }

            // destroy the device
            ioctl(ufile, UI_DEV_DESTROY);

            close(ufile);

}
ps: Keymouse works on the same principle; you can find similar code in its device.cpp. Refs: Sending simulated mouse events using uinput, Mouseemu / uinput, Linux Cross Reference, Linux 驅動程式的 I/O, #4: fops->ioctl 實作

Sunday, September 19, 2010

dinotrace @ waveform viewer

Why are Dinotrace's widgets so ugly? If you are used to the Debussy/Verdi interface, GTKwave looks much better. Still, for the sake of study, we discuss a few topics below. 1. VCD format:
// creation or modification date
$date
        Mon Jun 15 17:13:54 1998
$end

// tool that generated the VCD file, and its version
$version
        Chronologic Simulation VCS version 4.0.3
$end

// timescale (the time unit)
$timescale
        1ns
$end

// port range and key map
// ex: $var reg       1 !    clk  $end
// means signal name (clk), width 1 bit, identifier code "!"

$scope module benchx $end
$var reg       1 !    clk  $end
$var reg      11 "    count [10:0] $end
$var reg       1 #    toggle_01  $end
$upscope $end

$enddefinitions $end
#0
$dumpvars
x!                            //clk      =x @0 
x#                            //toggle_01=x @0
bxxxxxxxxxxx "                //count    =bxxxxxxxxxxx @0
$end
#5
0!                            //clk      =0 @5 
#10
1!                            //clk      =1 @10
z#
#15
0!
#20
1!
#25
0!
b00000000000 "
0#

//ps: a signal with no entry at a timestamp keeps its previous value.
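
As a rough idea of what the parser side of a viewer has to do with the value-change section above, here is a minimal C sketch that classifies one line as a timestamp (#10), a scalar change (1!) or a vector change (b00000000000 "). The types vcd_kind_t and vcd_change_t and the function vcd_parse_line are invented for this example and are not taken from Dinotrace or GTKwave; header keywords such as $var are simply reported as "other".

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

typedef enum { VCD_TIME, VCD_SCALAR, VCD_VECTOR, VCD_OTHER } vcd_kind_t;

typedef struct {
    vcd_kind_t    kind;
    unsigned long time;       /* valid when kind == VCD_TIME */
    char          value[65];  /* one char for a scalar, bit string for a vector */
    char          id[9];      /* identifier code, e.g. "!" */
} vcd_change_t;

/* Classify one line of the value-change section and fill in *out. */
vcd_kind_t vcd_parse_line(const char *line, vcd_change_t *out)
{
    assert(line != NULL && out != NULL);
    memset(out, 0, sizeof *out);
    out->kind = VCD_OTHER;

    if (line[0] == '#') {                              /* timestamp */
        if (sscanf(line + 1, "%lu", &out->time) == 1)
            out->kind = VCD_TIME;
    } else if (line[0] == 'b' || line[0] == 'B') {     /* vector: b<bits> <id> */
        if (sscanf(line + 1, "%64[01xXzZ] %8s", out->value, out->id) == 2)
            out->kind = VCD_VECTOR;
    } else if (line[0] != '\0' && strchr("01xXzZ", line[0]) != NULL) {
        if (sscanf(line + 1, "%8s", out->id) == 1) {   /* scalar: <value><id> */
            out->value[0] = line[0];
            out->kind = VCD_SCALAR;
        }
    }
    return out->kind;
}
```

A viewer would keep a table from identifier code to signal name (built from the $var lines) and append each parsed change to that signal's waveform at the most recent timestamp.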
2. Widgets: GTK is used as the sample here.
 GtkWidget *window;
 window = gtk_window_new (GTK_WINDOW_TOPLEVEL);       // create the window
 gtk_widget_set_usize (GTK_WIDGET (window), 200, 100);// set the size
 gtk_window_set_title (GTK_WINDOW (window), "GTK Menu Test");// window title
 gtk_signal_connect (GTK_OBJECT (window), "delete_event",
                        (GtkSignalFunc) gtk_main_quit, NULL); // close button
 ....
 gtk_widget_show (window); // show window
Menu Widget. Once you understand the VCD file format and widgets, what remains is just building the GUI interface: combining buttons + events + menus + widgets + a VCD file parser. Refs: VCD file format, Waveform_viewer, verdi tutorial, SpringSoft

Saturday, September 18, 2010

3D graphic @ Mesa case study

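I recently started looking into graphics and found that Mesa offers hardware acceleration: going through Mesa's API to drive the hardware directly reduces software complexity. I gathered some related material below for anyone interested. Refs: x.org case study, Linux 的3D加速--DRI, Mesa, Mesa 3D, OpenGL, [PDF] OpenGL Programming Guide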

Friday, September 17, 2010

X window protocol

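For work I need to connect from a PC to a workstation and bring the desktop display back over the X Window protocol. X.org has its X server and X clients, but the functionality was still incomplete for this, so I found the tool Xephyr. The command below does the job. You can also use the multiple desktop displays Xephyr provides to manage several windows.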
Xephyr -kb -query xxx.xxx.xxx.xxx -fullscreen -ac :10
Refs: X Window System core protocol Xephyr Multiseat Configuration/Xephyr

Open source logic analyzer software

This is a topic I saw on a foreign site, Hack a Day. If you are interested, go and see how crazy people abroad can be; seeing it made me feel ashamed of myself. Ref: Logic Analyzer

Thursday, September 16, 2010

DFG @ scheduling Algorithm..

In the high-level synthesis flow, the scheduling algorithm plays a very important role: it determines the HW architecture and the control flow. What makes it work behind the scenes is a good data-flow definition. Below is the one you hear about most often, taken from graph theory: the DFG (data flow graph).

In earlier research we mentioned several high-level synthesis tools, such as Behavior Synthesizer tool @sister and Design Automation Tool from Behavior Level to Transaction Level for Virtual Bus-Based Platforms. Both use a DFG as their database. The advantage is that it is easier to understand and implement. A sample is used below to explain the DFG structure.

Suppose the macro below is a SystemC process whose trigger type and sensitive type are already known. First, code analysis parses the syntax, keywords, and tokens; lex and yacc are worth considering as the filter and structure setup. After parsing, a graph is built for each section in order. Because the DFG is cut by control, every condition starts a new graph, and each graph is further divided into its own DFG and ctrl graph. Finally, a linked list records the order of the graphs and forms the top graph.

The lowest-level DFG looks roughly like this; it holds the NxtNode, PreNode, and OperType definitions. The top graph looks roughly like this: under one level of graph list there is another level of graph list.

Because this is a cycle design for Verilog, the section(s) at the top level are cycle by cycle. You can think of them as independent of each other: data only changes after the next clock edge. Under that premise we can apply constraints and scheduling to each section, such as HW operator number constraints, operator delay settings, or ASAP/ALAP scheduling.

Refs: Scheduling Algorithms For High-Level Synthesis, 15 References [1 ...[PDF], AEON: Synthesizing Scheduling Algorithms from High-Level Models [PDF], Scheduling (computing)
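
To show what the node definition and ASAP scheduling mentioned in this post might look like in code, here is a small C sketch. The struct layout and names (dfg_node_t, oper_type_t, asap_schedule) are assumptions made for this example, loosely following the PreNode/OperType idea above, and every operator is assumed to take exactly one control step.

```c
#include <assert.h>
#include <stddef.h>

#define DFG_MAX_PRED 4

typedef enum { OP_ADD, OP_SUB, OP_MUL } oper_type_t;

typedef struct {
    oper_type_t type;                /* OperType */
    int         n_pred;              /* number of predecessors (PreNode) */
    int         pred[DFG_MAX_PRED];  /* indices of predecessor nodes */
    int         step;                /* control step chosen by the scheduler */
} dfg_node_t;

/* ASAP scheduling: each node runs one step after its latest predecessor.
 * Nodes must be stored in topological order (predecessors first).
 * Returns the number of control steps used. */
int asap_schedule(dfg_node_t *nodes, size_t n)
{
    int last = 0;
    for (size_t i = 0; i < n; i++) {
        int step = 1;                                  /* no predecessors: step 1 */
        for (int k = 0; k < nodes[i].n_pred; k++) {
            int p = nodes[i].pred[k];
            assert(p >= 0 && (size_t)p < i);           /* topological order */
            if (nodes[p].step + 1 > step)
                step = nodes[p].step + 1;
        }
        nodes[i].step = step;
        if (step > last)
            last = step;
    }
    return last;
}
```

For an expression such as (a+b)*(c-d)+e, the ADD and SUB land in step 1, the MUL in step 2 and the final ADD in step 3. ALAP would walk the same graph backwards from the last step, and an operator-count constraint would delay nodes when a step is already full.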

Monday, September 13, 2010

Verilog to c++/ SystemC @ Verilator...

Simulation time depends entirely on the machine and on the complexity of the code. Comparing Verilog with C++/SystemC, C++/SystemC simulates in less time, since the compiler (GNU) can optimize it for the machine's architecture. From a verification point of view, speed and accuracy are both factors to weigh, so Verilator offers a low-level to high-level simulation tool. Early in the design this makes design and test co-simulation possible and speeds up the whole time-to-market flow. Refs: verilator, Verilog2C++, High Performance SoC Modeling with Verilator

c to Verilog .....

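Put plainly, this is just a syntax conversion from C to Verilog plus some stage and signal mechanisms. It still uses the DFG (data flow graph) structure, so support for pointers and malloc'd memory is lacking. Control edges and for loops are still typically expanded (unrolled), with some operator/pipeline constraints added. That is an intuitive idea: I define so many operators in my C code, along with the architecture I want (FPGA support). But real SoC design does not work that way. Usually you access the internal cache, fetch data into registers, and then compute, rather than expanding everything into registers. Full expansion makes the hardware cost too high, and those registers cannot be shared. The example code is only 15 lines, yet the generated code is over 800 lines. That is also why no EDA tool company has released a best solution so far: 8 parts in 10 depend on coding style, just like garbage in = garbage out. But making a high-level designer think about all of that feels like going backwards. So the mainstream is still IP design: co-design with existing software/hardware IP. Everyone can code; architecture is the real core value. Ref: c to Verilog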

execve @ Linux Kernel

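execve is a bit like jumping into an external executable: it replaces the calling process with that program, so on success it never returns to the caller (to carry on afterwards, fork first and call execve in the child). In practice *.tcl or *.sh scripts are usually more convenient for chaining executables, since you do not have to care about the I/O interface. But since Linux provides this call, let's take a look. step 1. First build a sample executable: % gcc -o sample sample.c ps: ./sample needs an argument, ex: ./sample Hello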
#include <stdio.h>
#include <stdlib.h>

int main(int argc,char* argv[]){

 if(argc!=2){
  printf("usage %s <string> \n",argv[0]);
  exit(EXIT_FAILURE);
 }
  printf("%s\n",argv[1]);
  exit(EXIT_SUCCESS); 
}
step 2. Use execve to run the sample executable. Note that ./sample takes an argument, as in "./sample Hello", so the newargv passed to execve() holds "./sample" and "Hello", and the array must be terminated with NULL.
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>


int main(int argc,char *argv[]){
   char *newargv[] = { NULL, "Hello", NULL };
   char *newenviron[] = { NULL };

 if (argc != 2) {
                fprintf(stderr, "Usage: %s <file-to-exec>\n", argv[0]);
                exit(EXIT_FAILURE);
               }

               newargv[0] = argv[1];

               execve(argv[1], newargv, newenviron);
               perror("execve");   /* execve() only returns on error */
               exit(EXIT_FAILURE);

}
Refs: Linux Programmer's Manual EXECVE(2) Execve problem with invoking execve problem solaris

ptrace @ Linux Kernel

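There are several ways for a parent process to create a child process, such as fork, vfork, and clone, but these are all fairly high-level. Tracing the code shows register-level handling underneath, and ptrace is the function that exposes it: on x86 you will see eax, ecx, esp, and ebp being read and assigned, a bit like assembly encode/decode. We can also use ptrace to trace system calls. The sample below targets 32-bit x86 (it uses ORIG_EAX and 4-byte register slots).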
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <sys/reg.h>
#include <sys/syscall.h>   /* For SYS_write etc */

#include <stdio.h>
#include <stdlib.h>

int main()
{   pid_t child;
    long orig_eax, eax;
    long params[3];
    int status;
    int insyscall = 0;
    child = fork();
    if(child == 0) {
        ptrace(PTRACE_TRACEME, 0, NULL, NULL);
        execl("/bin/ls", "ls", NULL);
    }
    else {
       while(1) {
          wait(&status);
          if(WIFEXITED(status))
              break;
          orig_eax = ptrace(PTRACE_PEEKUSER,
                     child, 4 * ORIG_EAX, NULL);
          if(orig_eax == SYS_write) {
             if(insyscall == 0) {
                /* Syscall entry */
                insyscall = 1;
                params[0] = ptrace(PTRACE_PEEKUSER,
                                   child, 4 * EBX,
                                   NULL);
                params[1] = ptrace(PTRACE_PEEKUSER,
                                   child, 4 * ECX,
                                   NULL);
                params[2] = ptrace(PTRACE_PEEKUSER,
                                   child, 4 * EDX,
                                   NULL);
                printf("Write called with "
                       "%ld, %ld, %ld\n",
                       params[0], params[1],
                       params[2]);
                }
          else { /* Syscall exit */
                eax = ptrace(PTRACE_PEEKUSER,
                             child, 4 * EAX, NULL);
                    printf("Write returned "
                           "with %ld\n", eax);
                    insyscall = 0;
                }
            }
            ptrace(PTRACE_SYSCALL,
                   child, NULL, NULL);
        }
    }
    return 0;
}

code ref: Playing with ptrace. ps: you can also use strace to trace the system calls a program makes while it runs. Refs: ptrace, Playing with ptrace

pipe @ Linux Kernel

A pipe is a communication channel between a parent and a child. Its structure is similar to a FIFO, though not quite the same. pipe first creates a buffer with a read end (pipefd[0]) and a write end (pipefd[1]). Below, the parent creates a child process that reads from the pipe; the parent writes to the pipe and then waits for the child to end.
#include <sys/wait.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>

int
main(int argc, char *argv[])
{
    int pipefd[2];
    pid_t cpid;
    char buf;

    if (argc != 2) {
        fprintf(stderr, "Usage: %s <string>\n", argv[0]);
     exit(EXIT_FAILURE);
    }

    if (pipe(pipefd) == -1) { // create pipe structure...
        perror("pipe");
        exit(EXIT_FAILURE);
    }

    cpid = fork();  // create child proc
    if (cpid == -1) {
        perror("fork");
        exit(EXIT_FAILURE);
    }

    if (cpid == 0) {            /* Child reads from pipe */
        printf("Child...\n");
        close(pipefd[1]);       /* Close unused write end */
        /* read from the pipe one char at a time and echo it to stdout */
        while (read(pipefd[0], &buf, 1) > 0)
            write(STDOUT_FILENO, &buf, 1);

        write(STDOUT_FILENO, "\n", 1);
        close(pipefd[0]);       /* done reading */
        _exit(EXIT_SUCCESS);    /* child process ends */

    } else {                    /* Parent writes argv[1] to pipe */
        printf("Parent...\n");
        close(pipefd[0]);          /* Close unused read end */
        write(pipefd[1], argv[1], strlen(argv[1]));
        close(pipefd[1]);          /* Reader will see EOF */
        wait(NULL);                /* Wait for child */
        exit(EXIT_SUCCESS);
    }
}
code ref: Linux Programmer's Manual Refs : Linux Programmer's Manual PIPE(2) pipe(7) - Linux man page Executing programs with C(Linux)

Sunday, September 12, 2010

Qtstalker @ finance tool

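Qtstalker is an open-source project, somewhat like the analysis software TS (trade-station) and HTS (日盛). You can plug in trading strategies and custom indicators. And to think I was foolishly hand-coding all of this myself... though that has its upside: I get to know how each trading signal is triggered and my own rules for entering and leaving the market. See some of the theory and principles I worked out earlier. Ref: Finance Lists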

Features

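  • Supports more than a dozen technical indicators, such as MACD, RSI, and Bollinger Bands
  • Six chart types: line, bar, candlestick, point and figure, paint bars, and swing
  • A crude portfolio management mechanism, though adequate for tracking stock prices
  • Supports several market data sources; quotes can be fetched from Yahoo, CME, and NYBOT
  • Buy/sell markers, text, vertical lines, horizontal lines, and Fibonacci retracement lines can be added to charts
  • Charts can be displayed in daily, weekly, and monthly modes
  • Investment types: stock, futures, index, and spreads
  • Three chart scaling modes: scale to screen, all data series, and log
  • Quotes and indicators are modular, leaving room for future use
  • Backtesting: test indicator performance against actual trading data
  • Profit calculation: add specific stocks and compute the profit or loss from purchase value to sale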
Refs 盤後分析軟體 - qtstalker

Saturday, September 11, 2010

fork && clone @ Linux Kernel

Besides fork, clone is another way to create concurrent tasks. The biggest difference is whether the parent's memory is copied: fork copies the parent's address space, while clone (with CLONE_VM) shares it with the parent. The former is used more for outward-facing parallel work such as an external server/client; the latter more for internal tasks such as mm management. fork_smp.c: the parent's data is not changed by the child.
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(int argc,char *argv[]) {
        int count = 1;
        int child;

        /* fork() gives the child its own copy of count */
        if(!(child = fork())) {
                printf("This is son, his count is: %d. and his pid is: %d\n", ++count, getpid());
        } else {
                printf("This is father, his count is: %d, his pid is: %d\n", count, getpid());
        }
return 0;
}
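clone.c: the parent's data is changed by the child.
#define _GNU_SOURCE         /* needed for clone() */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sched.h>
#include <signal.h>
#define FIBER_STACK 8192

int a;

/* Runs in the child; CLONE_VM makes it share the parent's memory. */
int do_something(void *arg){
        (void)arg;
        printf("This is son, the pid is:%d, the a is: %d\n", getpid(), ++a);
        return 0;   /* ends the child; never free the stack it is running on */
}
int main(void) {
        void * stack;
        a = 1;
        stack = malloc(FIBER_STACK);
        if(!stack) {
                printf("The stack failed\n");
                exit(1);
        }

        printf("creating son thread!!!\n");

        /* CLONE_VFORK suspends the parent until the child exits */
        if(clone(do_something, (char *)stack + FIBER_STACK, CLONE_VM|CLONE_VFORK, NULL) == -1) {
                perror("clone");
                free(stack);
                exit(1);
        }
        printf("This is father, my pid is: %d, the a is: %d\n", getpid(), a);
        free(stack);
        return 0;
}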
Refs: fork,vfork和clone底层实现 fork, vfork, clone,pthread_create,kernel_thread clone fork及vfork的区别

Wednesday, September 8, 2010

disassemble your code @ objdump ....

Besides using gcc -S to produce *.s assembly code, you can also use the objdump and strace commands to disassemble a binary and trace its system calls. sample code @ c
TT *get_TTNode(TT *p,int Id){
    TT *tPtr = p;

    while(tPtr!=NULL){
      if( tPtr->Id == Id ){
           return tPtr; break;
      }
      tPtr = tPtr->Nxt;
   }

return NULL;
}
objdump -d (disassemble)
080484d3 <get_TTNode>:
 80484d3:       55                      push   %ebp
 80484d4:       89 e5                   mov    %esp,%ebp                        // store current stack pointer
 80484d6:       83 ec 14                sub    $0x14,%esp
 80484d9:       8b 45 08                mov    0x8(%ebp),%eax                   // get "TT *p"
 80484dc:       89 45 fc                mov    %eax,-0x4(%ebp)                  // "TT *tPtr = p"
 80484df:       eb 1b                   jmp    80484fc <get_TTNode+0x29>  // jump to while loop 
 80484e1:       8b 45 fc                mov    -0x4(%ebp),%eax                  // "tPtr->Id"
 80484e4:       8b 00                   mov    (%eax),%eax                      // mov to eax register
 80484e6:       3b 45 0c                cmp    0xc(%ebp),%eax                   // cmp (tPtr->Id == Id )? true : false;
 80484e9:       75 08                   jne    80484f3 <get_TTNode+0x20>
 80484eb:       8b 45 fc                mov    -0x4(%ebp),%eax                  
 80484ee:       89 45 ec                mov    %eax,-0x14(%ebp)                 // return tPtr
 80484f1:       eb 16                   jmp    8048509 <get_TTNode+0x36>  // break
 80484f3:       8b 45 fc                mov    -0x4(%ebp),%eax
 80484f6:       8b 40 0c                mov    0xc(%eax),%eax                   // tPtr = tPtr->Nxt
 80484f9:       89 45 fc                mov    %eax,-0x4(%ebp)
 80484fc:       83 7d fc 00             cmpl   $0x0,-0x4(%ebp)                  // while loop
 8048500:       75 df                   jne    80484e1 <get_TTNode+0xe>           // cmp (tPtr!=NULL)? true : false;
 8048502:       c7 45 ec 00 00 00 00    movl   $0x0,-0x14(%ebp)
 8048509:       8b 45 ec                mov    -0x14(%ebp),%eax                 //return NULL
 804850c:       c9                      leave  
 804850d:       c3                      ret 
Refs : objdump strace

Tuesday, September 7, 2010

LEX && YACC sample case pt2

Continuing from LEX && YACC sample case pt1: dress the List-Node up in Lex and Yacc clothes ^_^, then fill the parsed operations into the List in order. sample.l
%{
#include "y.tab.h"
%}

%%
([0-9]+|([0-9]*\.[0-9]+)([eE][-+]?[0-9]+)?) {
  yylval.dval = atof(yytext);
  return NUMBER;
 }

[ \t] ;   /* ignore white space */

[A-Za-z][A-Za-z0-9]* { /* return symbol pointer */
  return NAME;
 }

"$" { return 0; }

\n |
. return yytext[0];
%%
int yywrap()
{
    return 1;
}
sample.y
%{
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <malloc.h>
#include "tt.h"
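int OpId =0;
LIST   *lstr = NULL;   /* LIST is a typedef; "struct LIST" would be a different, incomplete type */

/* prototypes for functions used in the grammar actions below */
int yylex(void);
int yyerror(char const *str);
int displayOp(void);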
%}

%union {
 double dval;
}
%token <dval> NAME
%token <dval> NUMBER
%left '-' '+'
%left '*' '/'
%nonassoc UMINUS
%type <dval> expression
%%
statement_list: statement '\n'
 | statement_list statement '\n'
 ;

statement: NAME '=' expression { $1 = $3; }
 | expression  { printf("= %g\n", $1);
  displayOp();
  OpId =0; 
  lstr =NULL;
 }
 ;

expression: expression '+' expression { $$ = $1 + $3; 
    LIST *tlstr = (void *)lstr; //LIST list
    tlstr = set_List(tlstr,OpId,TT_ADD,$1,$3,$$); OpId++;
    lstr = (void *)tlstr;   
  }
 | expression '-' expression { $$ = $1 - $3; 
    LIST *tlstr = (void *)lstr; //LIST list
    tlstr = set_List(tlstr,OpId,TT_SUB,$1,$3,$$); OpId++;
    lstr = (void *)tlstr;   
  }
 | expression '*' expression { $$ = $1 * $3;
     LIST *tlstr = (void *)lstr; //LIST list
    tlstr = set_List(tlstr,OpId,TT_MUL,$1,$3,$$); OpId++;
    lstr = (void *)tlstr;   
  }
 | expression '/' expression
    { if($3 == 0.0)
      yyerror("divide by zero");
     else{
      $$ = $1 / $3;
     LIST *tlstr = (void *)lstr; //LIST list
    tlstr = set_List(tlstr,OpId,TT_DIV,$1,$3,$$); OpId++;
    lstr = (void *)tlstr;   
     }
    }
 | '-' expression %prec UMINUS { $$ = -$2; }
 | '(' expression ')' { $$ = $2; }
 | NUMBER
 | NAME   { $$ = $1; }
 ;
%%

int displayOp(){
LIST *tlstr = (void *)lstr;
TT   *tPtr = NULL; //TT   point
LIST *lPtr = NULL; //LIST point

int i=0;
char* Opst;
for(i=0; i<OpId; i++){

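 if(tlstr == NULL ){ printf("<E1> Initial List Error ...\n"); return -1; }
 lPtr = get_List(tlstr,i);
 if(lPtr == NULL ){ printf("<E2> Get Ptr Error ...\n"); return -1; } /* check before dereferencing */

 switch(lPtr->Typ){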
   case TT_ADD : printf("Op(%d) -> ADD\n",i);  break;
   case TT_SUB : printf("Op(%d) -> SUB\n",i);  break;
   case TT_MUL : printf("Op(%d) -> MUL\n",i);  break;
   case TT_DIV : printf("Op(%d) -> DIV\n",i);  break;
   case TT_MOD : printf("Op(%d) -> MOD\n",i);  break;
 default: return -1; break;
        }

 printf("SRC1...\n");
 if(lPtr == NULL ){ printf("<E2> Get Ptr Error ...\n"); return -1; }
 tPtr = (void *)lPtr->Parent;
 if(tPtr == NULL ){ printf("<E3> Get Ptr->Parent Error ...\n"); return -1; }
 tPtr = get_TTNode(tPtr,TT_SRC1);
 if(tPtr == NULL ){ printf("<E4> Get Ptr->Parent->Src1 Error ...\n"); return -1; }
 display_TTNode(tPtr);

 printf("SRC2...\n");
 if(lPtr == NULL ){ printf("<E2> Get Ptr Error ...\n"); return -1; }
 tPtr = (void *)lPtr->Parent;
 if(tPtr == NULL ){ printf("<E3> Get Ptr->Parent Error ...\n"); return -1; }
 tPtr = get_TTNode(tPtr,TT_SRC2);
 if(tPtr == NULL ){ printf("<E4> Get Ptr->Parent->Src2 Error ...\n"); return -1; }
 display_TTNode(tPtr);

 printf("DST...\n");
 if(lPtr == NULL ){ printf("<E2> Get Ptr Error ...\n"); return -1; }
 tPtr = (void *)lPtr->Child;
 if(tPtr == NULL ){ printf("<E3> Get Ptr->Child Error ...\n"); return -1; }
 tPtr = get_TTNode(tPtr,TT_DST);
 if(tPtr == NULL ){ printf("<E4> Get Ptr->Child->DST Error ...\n"); return -1; }
 display_TTNode(tPtr);
 }

 printf("=====================\n");
 printf("\n");
 printf("\n");
 return 0;
}


int yyerror(char const *str)
{
    extern char *yytext;
    fprintf(stderr, "parser error near %s\n", yytext);
    return 0;
}

int main(void)
{
    extern int yyparse(void);
    extern FILE *yyin;

    yyin = stdin;
    if (yyparse()) {
        fprintf(stderr, "Error ! Error ! Error !\n");
        exit(1);
    }
}

Results:

(1+2)*2
= 6
Op(0) -> ADD
SRC1...
Id :: 5,Nm :: 1.000000
SRC2...
Id :: 6,Nm :: 2.000000
DST...
Id :: 7,Nm :: 3.000000
Op(1) -> MUL
SRC1...
Id :: 5,Nm :: 3.000000
SRC2...
Id :: 6,Nm :: 2.000000
DST...
Id :: 7,Nm :: 6.000000
=====================

1+2*2
= 5
Op(0) -> MUL
SRC1...
Id :: 5,Nm :: 2.000000
SRC2...
Id :: 6,Nm :: 2.000000
DST...
Id :: 7,Nm :: 4.000000
Op(1) -> ADD
SRC1...
Id :: 5,Nm :: 1.000000
SRC2...
Id :: 6,Nm :: 4.000000
DST...
Id :: 7,Nm :: 5.000000
=====================

project download here
Refs: lex&yacc 第一章例5, lex & yacc, 2nd Edition, lex_and_yacc_chinese_version

Monday, September 6, 2010

LEX && YACC sample case pt1

Lex & Yacc case study @ PLY already covered how Lex and Yacc are used. Here the main idea is to use linked lists to store the tokens Yacc builds as a node list, which later makes scheduling and mapping convenient to do internally. tt.h

#ifndef TT_H
#define TT_H
#include <stdio.h>
#include <stdlib.h>
#include <malloc.h>

enum TTType {
 TT_ADD =0,
 TT_SUB =1,
 TT_MUL =2,
 TT_DIV =3,
 TT_MOD =4,
 TT_SRC1 =5,
 TT_SRC2 =6,
 TT_DST =7,
};

typedef struct tt {
 int Id;
 char*  Nm;
 struct tt *Nxt;
} TT;

typedef struct list {
 int    Typ;
 int    OpId;
 TT *Parent;   /* TT is a typedef of struct tt; "struct TT" would be a different type */
 TT *Child;
 struct list *Nxt;
} LIST;


TT *set_TTNode(TT *p,int Id,char* Nm){

    TT *tPtr =  malloc(sizeof(TT));
    if( tPtr == NULL )
         return NULL;
   
     tPtr->Id = Id;
     tPtr->Nm = Nm;
     tPtr->Nxt = NULL;
     
    if( p == NULL ){
        p = tPtr;
    } else {
        tPtr->Nxt = p;
        p      = tPtr;
   }
 
return p;
}

TT *display_TTNode(TT *p){
   if(p!=NULL){
     printf("Id :: %3d,",p->Id);
     printf("Nm "" %3s\n",p->Nm);
   }
return p; 
}

TT *get_TTNode(TT *p,int Id){
    TT *tPtr = p;

    while(tPtr!=NULL){
      if( tPtr->Id == Id ){
           return tPtr; break;
      }
      tPtr = tPtr->Nxt;
   }

return NULL; 
}

LIST* set_Parent2List(LIST *l,int Id,char *Nm){
      
     if(l==NULL)
        return NULL;

     TT *tPtr  = (void *) l->Parent;
     tPtr      = set_TTNode(tPtr,Id,Nm);
     l->Parent = (void *)tPtr;

return l; 
}

LIST* set_List(LIST *l,int OpId,int Tp,char *src1,char *src2,char *dst){

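    LIST *lPtr =  malloc(sizeof(LIST));
    if( lPtr == NULL )
         return NULL;
    
     lPtr->Parent = NULL;  /* malloc does not zero memory */
     lPtr->Child  = NULL;
     lPtr      = set_Parent2List(lPtr,TT_SRC1,src1); 
     lPtr      = set_Parent2List(lPtr,TT_SRC2,src2); 
     lPtr->Child = (void *)set_TTNode(NULL,TT_DST,dst); /* keep the destination too */
     lPtr->Typ = Tp;
     lPtr->OpId= OpId;
     lPtr->Nxt = NULL;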

     if(l==NULL){
        l = lPtr;
     }else {
        lPtr->Nxt = l;  
        l         = lPtr;
 
    }

return l;
}

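/* Return the operation with the given OpId, or NULL if it is not in the list. */
LIST* get_List(LIST *l,int OpId){

  LIST *lPtr = l;

  while(lPtr!=NULL){
        if(lPtr->OpId == OpId){ return lPtr; }
        lPtr = lPtr->Nxt;
  }

  return NULL;  /* not found; main() checks for this */
}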
#endif
tt.c

#include "tt.h"

int main(){

int OpId =0;

LIST *lstr = NULL; //LIST list
TT   *tPtr = NULL; //TT   point
LIST *lPtr = NULL; //LIST point

lstr = set_List(lstr,OpId,TT_ADD,"a","b","c"); OpId++;
lstr = set_List(lstr,OpId,TT_ADD,"c","d","e"); OpId++;

if(lstr == NULL ){ printf("<E1> Initial List Error ...\n"); return -1; }
lPtr = get_List(lstr,0);

if(lPtr == NULL ){ printf("<E2> Get Ptr Error ...\n"); return -1; }
tPtr = (void *)lPtr->Parent;

if(tPtr == NULL ){ printf("<E3> Get Ptr->Parent Error ...\n"); return -1; }
tPtr = get_TTNode(tPtr,TT_SRC1);

if(tPtr == NULL ){ printf("<E4> Get Ptr->Parent->Src1 Error ...\n"); return -1; }
display_TTNode(tPtr);

return 0;
}
Results:

Id :: 5,Nm a

Refs: pointer 2 pointer, memory map

tPtr = (void *)lPtr->Parent;

2010年9月4日 星期六

SMP @ Linux Kernel case study

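On a multi-core architecture, besides using CPU_SET from sched.h to choose which CPU a scheduled process should run on, we also have to think about the CPU lock/schedule mechanism. As soon as several CPUs access the same memory address, a lock is needed to make sure that address is not overwritten. The Linux kernel provides spinlock_t and rwlock (read-write lock) for this, but with an rwlock you have to wait until the lock is released before you can proceed. That is why kernel 2.6 added the RCU (read-copy-update) mechanism: the update is made on a copy reached through a new pointer, nobody has to wait for a lock to be released, and the old pointer is updated once the work is done. This reduces the time spent waiting for an unlock.

sample code 4 CPU_SET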
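/* _GNU_SOURCE must be defined before any header is included so that
   <sched.h> exposes cpu_set_t and the CPU_* macros */
#define _GNU_SOURCE
#include<stdlib.h>
#include<stdio.h>
#include<sys/types.h>
#include<sys/sysinfo.h>
#include<unistd.h>
#include<sched.h>
#include<ctype.h>
#include<string.h>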

int main(int argc, char* argv[])
{
        int num = sysconf(_SC_NPROCESSORS_CONF);
        int created_thread = 0;
        int myid;
        int i;
        int j = 0;

        cpu_set_t mask;
        cpu_set_t get;

        if (argc != 2)
        {
                printf("usage : ./cpu num\n");
                exit(1);
        }

        myid = atoi(argv[1]);

        printf("system has %i processor(s). \n", num);

        CPU_ZERO(&mask);
        CPU_SET(myid, &mask);

        if (sched_setaffinity(0, sizeof(mask), &mask) == -1)
        {
                printf("warning: could not set CPU affinity, continuing...\n");
        }
        while (1)
        {

                CPU_ZERO(&get);
                if (sched_getaffinity(0, sizeof(get), &get) == -1)
                {
                        printf("warning: could not get cpu affinity, continuing...\n");
                }
                for (i = 0; i < num; i++)
                {
                        if (CPU_ISSET(i, &get))
                        {
                                printf("this process %d is running processor : %d\n",getpid(), i);
                        }
                }
        }
        return 0;
}
Code reference: [Highlight] An example of binding a process to a CPU on a multi-CPU system

Refs: Linux RCU mechanism explained, Read-copy-update

2010年9月1日 星期三

NetWork on Chip @ c emulator

Hi all, we wrote a sample NOC emulator in C with pthreads. It supports multiple tasks, such as a Receiver and a Transmitter at each Net-Node, and it borrows some ideas from the AXI bus: a two-channel design handles the Address phase and the Data phase separately, which can increase performance and reduce power consumption. The current version only supports the Address phase; you can add the Data phase detection to it. thx

1. NOC flow chart

Architecture: defines the architecture set (Map Table), which includes the connections between Net-Nodes and each node's information (FIFO INDEX, EMPTY, FULL)...
Task List: defines how many jobs have to be done, and includes our definition tags.
Node Trace: traces the next node and detects whether the task is finished or not.

NOC Architecture view

parts of network.c
void *SetAddrInf2NetNodeId_0(void *t){
     int NodeId = (int)t;
     int cot;

while( CheckOwnTaskListAddrDone(NodeId) == NET_FALSE ){
    cot =3;
     while( NetNode[NodeId].Addr_FULL == NET_TRUE ){
            sleep( NetNode[NodeId].Addr_DELAY );
            if( cot== 0 ){ printf("Out-of-Time Wait 4 NetNode Set Addr Phase @ %d \n",NodeId); break; }
            cot--;
     }

     if( cot >0 ){
        pthread_mutex_lock(&count_mutex);
        if ( CheckTaskListAndSetAddrInf2NetNode(NodeId) == NET_OK_TASK ){
                printf("Set TaskList 2 NetNode Ok @ %d \n",NodeId);
                if(NET_DEBUG==0){ DisplayMapTable4NetNode();}
        }
        pthread_mutex_unlock(&count_mutex);
     } else {
           sleep( NetNode[NodeId].Addr_DELAY );
    }

    sleep( NetNode[NodeId].Addr_DELAY );
 }
 pthread_exit(NULL);
}

void *GetAddrInf2NetNodeId_0(void *t){
     int NodeId = (int)t;
     int cot;

while( CheckOwnTaskListAddrDone(NodeId) == NET_FALSE ){
    cot =3;
    while( NetNode[NodeId].Addr_EMPTY == NET_TRUE ){
           sleep( NetNode[NodeId].Addr_DELAY );
           if( cot== 0){ printf("Out-of-Time Wait 4 NetNode Get Addr Phase @ %d \n",NodeId); break; }
           cot--;
   }

   if( cot >0 ){
        pthread_mutex_lock(&count_mutex);
        if (CheckAddrInfNetNode2TaskList(NodeId) == NET_OK_TASK ){
               printf("Get NetNode 2 TaskList Ok @ %d\n",NodeId);
               if(NET_DEBUG==0 ){ DisplayMapTable4NetNode(); }
               if(TASK_DEBUG==0){ DisplayTaskList();         }
        }
        pthread_mutex_unlock(&count_mutex);

  } else {
      sleep( NetNode[NodeId].Addr_DELAY );
  }

   sleep( NetNode[NodeId].Addr_DELAY );
 }

 pthread_exit(NULL);
}
Results

Set TaskList 2 NetNode Ok @ 0
Get NetNode 2 TaskList Ok @ 0
TId :: 0,NId :: 4,FromAddr :: 400,ToAddr :: 400,RWType :: 5,DepTId :: -1,AddrDone :: 0
TId :: 1,NId :: 3,FromAddr :: 300,ToAddr :: 400,RWType :: 6,DepTId :: 0,AddrDone :: 1
TId :: 2,NId :: 7,FromAddr :: 700,ToAddr :: 800,RWType :: 6,DepTId :: -1,AddrDone :: 1
TId :: 3,NId :: 0,FromAddr :: 0,ToAddr :: 400,RWType :: 5,DepTId :: -1,AddrDone :: 1
---------------------------------------
Get NetNode 2 TaskList Ok @ 4
TId :: 0,NId :: 4,FromAddr :: 400,ToAddr :: 400,RWType :: 5,DepTId :: -1,AddrDone :: 0
TId :: 1,NId :: 3,FromAddr :: 300,ToAddr :: 400,RWType :: 6,DepTId :: 0,AddrDone :: 1
TId :: 2,NId :: 7,FromAddr :: 700,ToAddr :: 800,RWType :: 6,DepTId :: -1,AddrDone :: 1
TId :: 3,NId :: 0,FromAddr :: 0,ToAddr :: 400,RWType :: 5,DepTId :: -1,AddrDone :: 1
---------------------------------------

code download here...

Refs: NetWork on Chip @c