2012年12月23日 星期日

Software Transactional Memory


The traditional approach: thread locks

def f(list1, list2):
    acquire_all_locks(list1.lock, list2.lock)
    x = list1.pop()
    list2.append(x)
    release_all_locks(list1.lock, list2.lock)

With Software Transactional Memory (STM), each thread retries its transaction until it commits without conflict:
def f(list1, list2):
    while True:
        t = transaction()
        x = list1.pop(t)
        list2.append(t, x)
        if t.commit():
            break
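The commit/retry loop above can be sketched as a toy optimistic-concurrency STM in plain Python (TVar, atomically, and the single global commit lock are illustrative assumptions, not PyPy's implementation):

```python
import threading

class TVar(object):
    """A transactional variable: a value plus a version counter."""
    def __init__(self, value):
        self.value = value
        self.version = 0

_commit_lock = threading.Lock()

def atomically(fn):
    """Retry fn until it commits without conflict.

    Reads record the version seen; writes are buffered and applied
    only if no read variable changed in the meantime.
    """
    while True:
        reads, writes = {}, {}
        def read(tv):
            reads[tv] = tv.version
            return writes.get(tv, tv.value)
        def write(tv, val):
            writes[tv] = val
        result = fn(read, write)
        with _commit_lock:
            if all(tv.version == v for tv, v in reads.items()):
                for tv, val in writes.items():
                    tv.value = val
                    tv.version += 1
                return result
        # conflict: some TVar changed since we read it, so retry

a = TVar(10)
b = TVar(0)

def transfer(read, write):
    x = read(a)
    write(a, x - 1)
    write(b, read(b) + 1)

atomically(transfer)
print(a.value, b.value)  # 9 1
```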

http://morepypy.blogspot.tw/2012/08/multicore-programming-in-pypy-and.html


http://morepypy.blogspot.tw/2011/06/global-interpreter-lock-or-how-to-kill.html

http://en.wikipedia.org/wiki/Hardware_transactional_memory

http://cs.brown.edu/courses/csci1610/papers/stm.pdf

http://morepypy.blogspot.tw/2011/08/we-need-software-transactional-memory.html

2012年11月5日 星期一

real-time memory-usage profiling in python

A simple real-time memory-usage example, using guppy to trace memory utilization.

Memory-utilization test case: "heapy.py"

from guppy import hpy
#import pickle
import gc
import sys
import time
import re
import pprint

#--------------------------------------
ppHP = hpy()
reTOTALSIZE = re.compile(r"Total\s+size\s+=\s+(\d+)\s+bytes", re.M)

def parser_mem_profile(funcnm=None,memprof=None):
    """ parser mem profile
    params:
        funcnm :  func name
        memprof:  table from hpy.heap() report
    return:
        print("funcnm","mem usage")
    """
    #print ("MEMPROF",funcnm,reTOTALSIZE.findall(memprof)[0])
    # interface for driveGnuPlotStreams.pl
    print '0:%.3f' % (float(reTOTALSIZE.findall(memprof)[0]) / 100000)


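To see what reTOTALSIZE extracts, here is a self-contained check against a fabricated report line (the sample text only mimics the shape of hpy().heap()'s summary; the numbers are made up for illustration, not real guppy output):

```python
import re

# Same pattern as reTOTALSIZE above, as a raw string.
pattern = re.compile(r"Total\s+size\s+=\s+(\d+)\s+bytes", re.M)

# Fabricated line in the shape of a guppy heap() summary.
sample = "Partition of a set of 1000 objects. Total size = 123456 bytes."

total = float(pattern.findall(sample)[0])
print('0:%.3f' % (total / 100000))  # 0:1.235
```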

def parser_gc_profile(funcnm=None,gcprof=None):
    """ parser gc profile
    params:
        funcnm :  func name
        gcprof :  table from gc.collect() report
    return:
        print("funcnm","num of obj to be collected")
    """
    #print ("GCPROF",funcnm, gcprof)
    # interface for driveGnuPlotStreams.pl
    #print '1:' + str(gcprof)



def decodrator_mem_profile(): #decorator **kwargs
    """ decorator mem usage profile collect """

    def inner(func):

        def wrapper(*args, **kwargs): # func *args, **kwargs

            retval = func(*args, **kwargs)

            def _store_mem_profile():
                """ store mem profile """
                parser_mem_profile(funcnm=str(func.__name__),\
                                   memprof=str(ppHP.heap()))

            def _store_gc_profile():
                """ store garbage collect """
                parser_gc_profile(funcnm=str(func.__name__),\
                                  gcprof=str(gc.collect()))

            _store_mem_profile()
            _store_gc_profile()
            return retval

        return wrapper
    return inner

#---------------------------------------------

class Pattern(object):

    def __init__(self):
        self._list = []

    def run(self):
        self._list = [ i*200 for i in range(10000) ]

    def clear(self):
        self._list = None

ppPattern = None

@decodrator_mem_profile()
def run_pattern0(wait=0.1):
    """ run test pattern0 """
    global ppPattern
    ppPattern = Pattern()
    ppPattern.run()
    time.sleep(wait)

@decodrator_mem_profile()
def free_pattern0(wait=0.1):
    """ free test pattern0 """
    global ppPattern
    ppPattern.clear()
    del ppPattern
    time.sleep(wait)


def run_test():
    """ run all tests for mem profile """

    for i in xrange(100):
        run_pattern0()
        free_pattern0()


def main():
    """ gen the PIPE test pattern to heapy_gnuplot
    ex:
        os.system((python heapy.py) | (python heapy_gnuplot))
    """

    run_test()


if __name__== "__main__":
    main()



Download driveGnuPlotStreams.pl to use as our plotting function.

Note: via a PIPE, the stdout of `python heapy.py` is fed as input to driveGnuPlotStreams.pl.

run command
(python heapy.py ; read) | perl ./driveGnuPlotStreams.pl 1 1 50 0 50 500x300+0+0 'mem' 0


results:


Downloads:
https://docs.google.com/open?id=0B35jh2lwIpeKVVJQa0dMZHhPckE
https://docs.google.com/open?id=0B35jh2lwIpeKRkpVcnVTb1VKdnM

Refs:
heapy tutorial
http://www.smira.ru/wp-content/uploads/2011/08/heapy.html
http://guppy-pe.sourceforge.net/#Heapy

real time plot(Gnuplot)
http://users.softlab.ece.ntua.gr/~ttsiod/gnuplotStreaming.html
http://www.lysium.de/blog/index.php?/archives/234-Plotting-data-with-gnuplot-in-real-time.html

real time plot(matplotlib)
http://matplotlib.org/
install
http://matplotlib.org/users/installing.html#manually-installing-pre-built-packages

2012年11月4日 星期日

garbage collection in python

key words


Mark and sweep (classic mark-and-sweep implementation)

Semispace copying (two-arena garbage collection; copying of alive objects
into the other arena happens when the active arena is full)

Generational GC (implemented as a subclass of the Semispace copying GC, this
one adds two-generation garbage collection to distinguish between short-lived
and long-living objects)

Hybrid GC (adding another generation to handle large objects), Mark &
Compact GC (with in-place compaction to save space, but using multiple
passes) and the Minimark GC (a combination of the previous methods,
rewritten and with a custom allocator).
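The first of these, mark-and-sweep, can be sketched in a few lines (Obj, mark, and sweep are illustrative names; a real collector walks the interpreter's actual object graph and frees memory instead of filtering a list):

```python
class Obj(object):
    """A heap object with outgoing references and a mark bit."""
    def __init__(self, name):
        self.name = name
        self.refs = []
        self.marked = False

def mark(obj):
    """Mark phase: flag everything reachable from obj."""
    if obj.marked:
        return
    obj.marked = True
    for ref in obj.refs:
        mark(ref)

def sweep(heap, roots):
    """Sweep phase: keep only objects reachable from the roots."""
    for o in heap:
        o.marked = False
    for r in roots:
        mark(r)
    return [o for o in heap if o.marked]

a, b, c = Obj('a'), Obj('b'), Obj('c')
a.refs.append(b)          # a -> b; c is unreachable garbage
heap = [a, b, c]
print([o.name for o in sweep(heap, roots=[a])])  # ['a', 'b']
```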



Refs:
http://en.wikipedia.org/wiki/Garbage_collection_(computer_science)
http://fcamel-life.blogspot.tw/2011/12/cpython-garbage-collection.html
http://blog.ez2learn.com/2010/04/22/memory-efficient-python-with-bytearray/
http://blogs.msdn.com/b/abhinaba/archive/2009/01/25/back-to-basic-series-on-dynamic-memory-management.aspx

2012年11月2日 星期五

multiprocess + queue in python

A simple Verilog always-block simulator, built with multiprocessing, a task queue, and decorators.
# --*-- utf8 --*--

import timeit
from collections import defaultdict
from multiprocessing import Process, Manager
import time

# global define
TASKQUEUE  = None
PROFILE    = None
STIMULATORS = defaultdict(list)

#class

class Task(object):
    """ atomic basic element for Task assign """

    def __init__(self, func, sensitives, *args, **kwargs):
        """ init
        params:
        func :     link 2 func ptr
        sensitive : (sensitive, condition)
        time : current simulation time
        ex:
        ">>>
        @decorator_block
        def add()
        ....
        Task(func, (clk,1))
        """
        self.func = func
        self.sensitives = sensitives
        self.time = 0
        self.args = args
        self.kwargs = kwargs

    def purge(self):
        self.func = None
        self.sensitives = None
        self.time = None
        self.args = None
        self.kwargs = None

    def __repr__(self):
        return "%s(Time:%s, Func:%s, Sensitive:%s, args=%s, kwargs=%s)" \
                %(self.__class__,\
                self.time,\
                self.func,\
                self.sensitives,\
                self.args,\
                self.kwargs)

    def isUpdateable(self, other):
        """ True if other (a time) is later than the current time """
        return other > self.time


    def update(self, time=None):
        """ update """
        self.time = time


class TaskQueue(object):
    """ register each Task in TaskQueue """

    def __init__(self):
        """ init """
        self.tasks = []

    def register(self, task):
        """ add task in task queue """
        assert(isinstance(task, Task))
        self.tasks.append(task)

    def dump(self):
        """ update task in task queue """
        print self.tasks

    def purge(self):
        """ purge """
        [it.purge() for it in self.tasks if it != None]

    def get(self):
        """ get tasks """
        return self.tasks


#-------------------------------------------------------

TASKQUEUE = TaskQueue()
PROFILE   = {}

def decodrator_block(**kwargs): #decorator **kwargs
    """ decorator profile collect all blocks and register it to TaskQueue """

    global TASKQUEUE

    sensitives = list(tuple(kwargs.items()))

    def inner(func):

        def wrapper(*args, **kwargs): # func *args, **kwargs

            def _store_cProfile():
                """ store elapsed wall-clock time of the wrapped call """
                start = timeit.default_timer()
                func(*args, **kwargs)
                PROFILE[func.__name__] = str(timeit.default_timer() - start)

            def _store_TaskQueue():
                """ store TaskQueue """
                task = Task(func, sensitives, *args, **kwargs)
                TASKQUEUE.register(task)

            _store_cProfile()
            _store_TaskQueue()

        return wrapper
    return inner
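The registration half of decodrator_block (collect the decorated function plus its sensitivity list at decoration time) is the classic parameterized-decorator pattern; here is a stripped-down sketch (REGISTRY, register, and add1 are illustrative names, not part of the simulator above):

```python
REGISTRY = []

def register(**sensitives):
    """Parameterized decorator: records (name, sensitivity list)
    the moment the function is defined."""
    def inner(func):
        REGISTRY.append((func.__name__, sorted(sensitives.items())))
        def wrapper(*args, **kwargs):
            return func(*args, **kwargs)
        return wrapper
    return inner

@register(clk1=True, clk2=True)
def add1(a):
    return a + 1

print(REGISTRY)   # [('add1', [('clk1', True), ('clk2', True)])]
print(add1(3))    # 4
```

Note that this wrapper returns the wrapped function's result, which the simulator's wrapper drops.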


def getValuableTasks(trigger=None):
    """ get valuable Tasks """

    global TASKQUEUE
    fit = []

    for task in TASKQUEUE.get():
        for sensitive in task.sensitives:
            if sensitive[0] == trigger:
                fit.append(task)

    return fit



def runParallelTasks(time=None,trigger=None):
    """ run each Task when the time and trigger are matched """

    valuableTasks = getValuableTasks(trigger=trigger)

    procs = []

    for valuableTask in valuableTasks:
        p = Process(target=valuableTask.func, \
                    args=valuableTask.args, \
                    kwargs=valuableTask.kwargs)
        p.start()
        procs.append(p)

    for proc in procs:
        proc.join()



#--------------------------------------------------------
def DesignUnderTest():
    """ DUT """
    # the same as the verilog always block:
    # >>> always @(posedge clk1) begin
    # >>>    a <= a+1
    # >>> end

    @decodrator_block(clk1=True, clk2=True)
    def ADD0(a):
        print "@time %0.2f decodrator_block_ADD0 %d = %d + 1" %(time.time(), a+1, a)
        return a+1
    # the same as the verilog always block:
    # >>> always @(posedge clk2) begin
    # >>>    b <= b*3
    # >>> end

    @decodrator_block(clk2=True)
    def MUX1(b):
        print "@time %0.2f decodrator_block_MUX1 %d = %d * 3" %(time.time(), b*3, b)
        return b*3

    ADD0(3)
    MUX1(3)


def preTest():
    """ pre simulation test env  """

    global STIMULATORS

    for i in range(4):
        STIMULATORS[i*2].append("clk1")
        STIMULATORS[i*4].append("clk2")

def runTest():
    """ run simulation test env """

    times = sorted(STIMULATORS.keys())
    for t in times:  # avoid shadowing the time module
        triggers = STIMULATORS[t]
        for trigger in triggers:
            runParallelTasks(time=t, trigger=trigger)


def rptTest():
    """ report simulation test env """


if __name__ == "__main__":
    DesignUnderTest()
    preTest()
    runTest()
    rptTest()
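The core multiprocessing pattern used above (spawn worker processes, then join them) reduces to a minimal self-contained sketch, using a Queue to collect results from the children:

```python
from multiprocessing import Process, Queue

def square_worker(q, n):
    """ compute in a child process and send the result back via the queue """
    q.put(n * n)

if __name__ == '__main__':
    q = Queue()
    procs = [Process(target=square_worker, args=(q, n)) for n in range(4)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    print(sorted(q.get() for _ in range(4)))  # [0, 1, 4, 9]
```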







code download : https://docs.google.com/open?id=0B35jh2lwIpeKTlpEeENPNkxlNlU

Refs:
http://docs.python.org/2/library/multiprocessing.html

2012年10月31日 星期三

VIM + python


Refs
https://github.com/b4winckler/macvim
https://github.com/crosbymichael/.dotfiles
http://blog.othree.net/log/2010/11/22/vim-for-python/

decorator + cprofile +sqlite3 python

Refs:
http://stackoverflow.com/questions/5375624/a-decorator-that-profiles-a-method-call-and-logs-the-profiling-result
http://www.doughellmann.com/PyMOTW/profile/
http://www.artima.com/weblogs/viewpost.jsp?thread=240808 http://docs.python.org/dev/library/functools.html

Download:
https://docs.google.com/open?id=0B35jh2lwIpeKb3R0Mnp6ckhGZTA

2012年10月30日 星期二

python table HDF5

The HDF5 library is a versatile, mature library designed for the storage of numerical data. The h5py package provides a simple, Pythonic interface to HDF5. A straightforward high-level interface allows the manipulation of HDF5 files, groups and datasets using established Python and NumPy metaphors. HDF5 provides a robust way to store data, organized by name in a tree-like fashion. You can create datasets (arrays on disk) hundreds of gigabytes in size, and perform random-access I/O on desired sections. Datasets are organized in a filesystem-like hierarchy using containers called "groups", and accessed using the traditional POSIX /path/to/resource syntax.

Refs:
example
https://github.com/qsnake/h5py/tree/master/h5py/tests
http://code.google.com/p/h5py/
http://alfven.org/wp/hdf5-for-python/
http://pytables.github.com/usersguide/

2012年10月19日 星期五

nose unittest extender

nose is a handy unittest manager: each test suite can be run together with the others or separately, and after the run you get a coverage report. It is well suited for regression testing. Below is a very simple example.

dut.py: the methods under test
""" dut """

__all__ = ['frame']

class frame(object):

    def __init__(self):
        self.name = "frame"

    def double(self, w):
        return w * 2

    def triple(self, w):
        return w * 3
test_dut.py: tests for the methods above
 
import os
import unittest

import nose

from dut import *
import gc

class TestDouble(unittest.TestCase):

    def setUp(self):
        self.frame = frame()

    def tearDown(self):
        self.frame = None
        gc.collect()


    @unittest.skip("calling test skip test_double_word")
    def test_double_word(self):
        """ test double word """

        expect  = "hihi"
        results = self.frame.double("hi")
        self.assertTrue(expect == results)


    def test_double_dec(self):
        """ test double dec """

        expect = 4.0
        results = self.frame.double(2.0)
        self.assertTrue(expect == results)


if __name__ == '__main__':
    # unittest.main()
    # nose.runmodule(argv=[__file__,'-vvs','-x', '--ipdb-failure'],
    #                exit=False)
    nose.runmodule(argv=[__file__, '-vvs', '-x', '--pdb', '--pdb-failure'],
                   exit=False)
__init__.py: empty file that makes the directory a package so nose can discover the tests

runtest.sh: top-level test runner

#!/bin/sh
coverage erase
nosetests -w ./ --with-coverage --cover-package=dut $*

how to run it
$ ./runtest.sh

test results
.S
----------------------------------------------------------------------
Ran 2 tests in 0.001s

OK (SKIP=1)

2012年10月17日 星期三

MongoDB VS SQL performance benchmark

performance benchmark
tips:
Memcached obviously wins the competition as it does not have to sync anything to disk. Surprisingly, MongoDB beats it in small-dataset inserts! I guess it is because the MongoDB driver uses a binary protocol and performs fire-and-forget inserts by default (unsafe mode). In addition, MongoDB does not enforce sync to disk, so a lot of writes are kept in memory. That's why it does so well on inserts of small rows.

SQL requires joins, and joins are slow. MongoDB is fast in large part because it doesn't use joins (most of the time).

Refs:
http://blog.michaelckennedy.net/2010/04/29/mongodb-vs-sql-server-2008-performance-showdown/ 

http://tobami.wordpress.com/2011/02/28/benchmarking-mongodb/ 

http://stackoverflow.com/questions/4465027/sql-server-and-mongodb-comparison
http://atlantischiu.blog.ithome.com.tw/post/3058/110773

http://zh.scribd.com/doc/28862327/MongoDB-High-Performance-SQL-Free-Database

2012年10月16日 星期二

pandas for data access

pandas makes it easy to work with large amounts of data and provides many ways to query it, e.g. join, split, groupby, column/row selection, hierarchical index support... The ideas are somewhat SQL-like, but it is faster and more convenient than SQL: there is no need to hand-write lots of queries, and it supports csv, HDF5 (PyTables) compression, json, ... data formats. ex: adapting the pandas examples to compute a moving average ....
"""
Some examples playing around with yahoo finance data
"""

from datetime import datetime

import matplotlib.finance as fin
import numpy as np
from pylab import show
import pprint

from pandas import Index, DataFrame
from pandas.core.datetools import BMonthEnd
from pandas import ols

startDate = datetime(2009, 9, 1)
endDate = datetime(2009, 9, 10)

def getQuotes(symbol, start, end):
    quotes = fin.quotes_historical_yahoo(symbol, start, end)
    dates, open, close, high, low, volume = zip(*quotes)

    data = {
        'open' : open,
        'close' : close,
        'high' : high,
        'low' : low,
        'volume' : volume
    }

    dates = Index([datetime.fromordinal(int(d)) for d in dates])
    return DataFrame(data, index=dates)


def getMoveAvage(frame, label='close', mvavg=5):
    """ get the moving average (trailing window of size mvavg) """

    assert(label in ['open', 'close', 'high', 'low', 'volume'])

    avgs    = []

    for indx, val in enumerate(frame.index):
        tot_sum = 0.0

        if indx > mvavg and mvavg >0:
            for i in range(mvavg):
                tot_sum += frame[label][indx-i]

            avgs.append(tot_sum/mvavg)

        else:
            avgs.append(0.0)

    data = {
            "%s_avg_%s" %(label,mvavg)  : avgs
            }

    return DataFrame(data, index=frame.index)


msft = getQuotes('MSFT', startDate, endDate)
msft_close_mv5 = getMoveAvage(msft, 'close', 5)
msft_open_mv5 = getMoveAvage(msft, 'open', 5)

new_msft = msft.join(msft_close_mv5)
print new_msft
Use np.sum to speed it up and cut down the number of Python-level memory accesses
....

def getMoveAvage2(frame, label='close', mvavg=5):
    """ get the moving average, with the window sum done by np.sum """

    assert(label in ['open', 'close', 'high', 'low', 'volume'])

    avgs    = []

    for indx, val in enumerate(frame.index):
        tot_sum = 0.0

        if indx > mvavg and mvavg >0:
            tot_sum = np.sum(frame[label][indx-mvavg+1:indx+1])

            avgs.append(tot_sum/mvavg)

        else:
            avgs.append(0.0)

    data = {
            "%s_avg_%s" %(label,mvavg)  : avgs
            }

    return DataFrame(data, index=frame.index)

#--------------------------------------

import profile
import pstats

msft = getQuotes('MSFT', startDate, endDate)

profile.run("getMoveAvage(msft, 'close', 5)", 'status0')
p0 = pstats.Stats('status0')
p0.sort_stats('time', 'cumulative').print_stats(5)

profile.run("getMoveAvage2(msft, 'close', 5)", 'status1')
p1 = pstats.Stats('status1')
p1.sort_stats('time', 'cumulative').print_stats(5)

rst0 = getMoveAvage(msft, 'close', 5)
rst1 = getMoveAvage2(msft, 'close', 5)
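For reference, the trailing-window logic in getMoveAvage can be isolated into a pandas-free sketch (rolling_mean is a hypothetical helper; it uses the standard full-window condition i >= window - 1, slightly different from the `indx > mvavg` test above):

```python
def rolling_mean(values, window):
    """Trailing moving average over a plain list; positions without a
    full window get 0.0, mirroring getMoveAvage's fallback."""
    out = []
    for i in range(len(values)):
        if i >= window - 1:
            out.append(sum(values[i - window + 1:i + 1]) / float(window))
        else:
            out.append(0.0)
    return out

print(rolling_mean([1, 2, 3, 4, 5], 3))  # [0.0, 0.0, 2.0, 3.0, 4.0]
```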

Refs:
http://pandas.pydata.org/pandas-docs/dev/
http://www.pytables.org/moin

2012年9月24日 星期一

multiprocessing vs thread


 multiprocessing vs thread vs serial
http://eli.thegreenplace.net/2012/01/16/python-parallelizing-cpu-bound-tasks-with-multiprocessing/

http://bpgergo.blogspot.tw/2011/08/matrix-multiplication-in-python.html

http://docs.python.org/library/multiprocessing.html

http://eli.thegreenplace.net/2011/12/27/python-threads-communication-and-stopping/

https://github.com/jerkos/metms_final/blob/dfccf8c88372d193f64dbe4ac4b062a7f04609b3/controller/dialog/MetClusteringControl.py

https://github.com/dagss/joblib/blob/0cd09592930dc1dea6ae3fcd92bad65a5eff1774/joblib/test/stresstest_store.py

http://www.doughellmann.com/PyMOTW/multiprocessing/communication.html


bundle method
http://stackoverflow.com/questions/1289813/python-multiprocessing-vs-threading-for-cpu-bound-work-on-windows-and-linux

2012年9月17日 星期一

python big data analysis


http://blog.wesmckinney.com/

panda + mongodb
https://github.com/pld/bamboo/blob/master/bamboo/models/observation.py


light weight performance bench
https://github.com/pydata/vbench

http://www.youtube.com/watch?v=hnhN2_TpY8g

hdf5
http://www.hdfgroup.org/tools5app.html

2012年8月6日 星期一

machine learning + python

http://scikit-learn.sourceforge.net/stable/

http://pybrain.org/

vim + python IDE


https://github.com/kaochenlong/eddie-vim

+

sudo apt-get install vim-scripts

vim-addons install taglist  
vim-addons install supertab
vim-addons install cscope
vim-addons install winmanager
vim-addons install tags

2012年8月5日 星期日

testing framework for driver


Robot Framework

http://code.google.com/p/robotframework/

staf

software testing automation framework

http://staf.sourceforge.net/current/STAFPython.htm


python software testing

http://samsnyder.com/2010/09/15/python-software-testing-and-automation/

2012年8月2日 星期四

The Intelligent Transport Layer - zeromq

zmq
https://github.com/zeromq/pyzmq/blob/master/README.rst

gevent 
http://www.gevent.org/

2012年8月1日 星期三

vpython 3d image api

http://www.vpython.org/contents/docs/visual/index.html

Mock - Mocking and Testing Library

mock is a library for testing in Python. It allows you to replace parts of your system under test with mock objects and make assertions about how they have been used.
Refs: http://www.voidspace.org.uk/python/mock/index.html#quick-guide
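A minimal sketch of the replace-and-assert workflow (Downloader is a made-up class for illustration; in Python 3 the standalone mock library ships as unittest.mock):

```python
from unittest import mock  # the standalone `mock` library became unittest.mock in Python 3

class Downloader(object):
    """ made-up system under test with a slow/unsafe dependency """
    def fetch(self, url):
        raise RuntimeError("no network in tests")

d = Downloader()
d.fetch = mock.Mock(return_value="<html>ok</html>")   # replace the real call

print(d.fetch("http://example.com"))                  # <html>ok</html>
d.fetch.assert_called_once_with("http://example.com")
```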

2012年6月27日 星期三

python template

python
http://wiki.python.org/moin/Templating

ex:


  <% for item in items: %>
    Name <%= item.name %>
  <% end %>
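For simple substitution the standard library's string.Template is enough, with no third-party engine; a minimal sketch:

```python
from string import Template

t = Template("Name $name")
items = [{'name': 'spam'}, {'name': 'eggs'}]

# Render the template once per item, like the loop above.
print("\n".join(t.substitute(it) for it in items))
# Name spam
# Name eggs
```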




ruby: the equivalent is the ERB library

python interface for c

swig
http://www.swig.org/Doc1.3/Python.html

boost python
http://www.boost.org/doc/libs/1_49_0/libs/python/doc/

python ctype
inline
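Of these, ctypes is in the standard library; a minimal sketch calling libc's abs (library name resolution is platform-dependent):

```python
import ctypes
import ctypes.util

# Locate and load the C standard library (e.g. libc.so.6 on Linux).
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare the C signature so ctypes converts arguments correctly.
libc.abs.restype = ctypes.c_int
libc.abs.argtypes = [ctypes.c_int]

print(libc.abs(-7))  # 7
```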

2012年3月4日 星期日

IPXACT + UVM


Automatic generation of OVM/UVM registers

http://www.duolog.com/solutions/industry-standards/

http://www.maojet.com.tw/whitepaper/main.jspx 

http://www.duolog.com/wp-content/uploads/DVCON_2012_3_IP-XACT_and_UVM.pdf


Benefits of using these standards together:

- IP-XACT can be a single-source specification for IP metadata. The
  specification is standardized and leads to:
  - Less ambiguity
  - Higher quality because of SCR checks
  - Higher levels of automation through generators
  - High levels of interoperability
- UVM provides advanced verification capabilities:
  - High level of HW/SW verification capability using the built-in UVM test
    sequences
  - Randomization, phasing, coverage, scoreboard
- If we can leverage the two standards we can get significant levels of
  verification automation and productivity
  - Does IP-XACT link well with UVM?
  - Focus: Let's investigate HW/SW interface verification