2015年12月16日 星期三

Grizzly 2.0: HttpServer API. Asynchronous HTTP Server - Part I

注意事項:
    1. 本文轉載於 Mytec Blog for oleksiys
    2. 建議先瞭解一下 Grizzly IOStrategies 的第一種 Worker-thread 模式
    3. 僅供參考查閱使用
In my previous blog entry I described basic Grizzly HttpServer API abstractions and offered a couple of samples to show how one can implement light-weight Servlet-like web application.
Here I'll try to show how we can process HTTP requests asynchronously within HttpHandler, in other words implement asynchronous HTTP application.
What do we mean by "asynchronous"?
Normally HttpServer has a service thread-pool, whose threads are used for HTTP requests processing, which includes following steps:
  1. parse HTTP request;
  2. execute processing logic by calling HttpHandler.handle(Request, Response);
  3. flush HTTP response;
  4. return service thread to a thread-pool.
Normally the steps above are executed sequentially in a service thread. Using "asynchronous" feature it's possible to delegate execution of steps 2 and 3 to a custom thread, which will let us release the service thread faster.
Why would we want to do that?
As it was said above, the service thread-pool instance is shared between all the HttpHandlers registered on HttpServer. Assume we have application (HttpHandler) "A", which executes a long lasting task (say a SQL query on pretty busy DB server), and application "B", which serves static resources. It's easy to imagine the situation, when couple of application "A" clients block all the service threads by waiting for response from DB server. The main problem is that clients of application "B", which is pretty light-weight, can not be served at the same time because there are no available service threads. So it might be a good idea to isolate these applications by executing application "A" logic in the dedicated thread pool, so service threads won't be blocked.
Ok, let's do some coding and make sure the issue we've just described is real.
HttpServer httpServer = new HttpServer();

NetworkListener networkListener = new NetworkListener("sample-listener", "127.0.0.1", 18888);

// Configure NetworkListener thread pool to have just one thread,
// so it would be easier to reproduce the problem
ThreadPoolConfig threadPoolConfig = ThreadPoolConfig
        .defaultConfig()
        .setCorePoolSize(1)
        .setMaxPoolSize(1);

networkListener.getTransport().setWorkerThreadPoolConfig(threadPoolConfig);

httpServer.addListener(networkListener);

httpServer.getServerConfiguration().addHttpHandler(new HttpHandler() {
    @Override
    public void service(Request request, Response response) throws Exception {
        response.setContentType("text/plain");
        response.getWriter().write("Simple task is done!");
    }
}, "/simple");

httpServer.getServerConfiguration().addHttpHandler(new HttpHandler() {
    @Override
    public void service(Request request, Response response) throws Exception {
        response.setContentType("text/plain");
        // Simulate long lasting task
        Thread.sleep(10000);
        response.getWriter().write("Complex task is done!");
    }
}, "/complex");

try {
    server.start();
    System.out.println("Press any key to stop the server...");
    System.in.read();
} catch (Exception e) {
    System.err.println(e);
}
In the sample above we create, initialize and run HTTP server, which has 2 applications (HttpHandlers) registered: "simple" and "complex". To simulate long-lasting task in the "complex" application we're just causing the current thread to sleep for 10 seconds.
Now if you try to call "simple" application from you Web browser using URL: http://localhost:18888/simple - you see the response immediately. However, if you try to call "complex" application http://localhost:18888/complex - you'll see response in 10 seconds. That's fine. But try to call "complex" application first and then quickly, in different tab, call the "simple" application, do you see the response immediately? Probably not. You'll see the response right after "complex" application execution completed. The sad thing here is that service thread, which is executing "complex" operation is idle (the same situation is when you wait for SQL query result), so CPU is doing nothing, but still we're not able to process another HTTP request.
How we can rework the "complex" application to execute its task in custom thread pool? Normally application (HttpHandler) logic is encapsulated within HttpHandler.service(Request, Response) method, once we exit this method, Grizzly finishes and flushes HTTP response. So coming back to the service thread processing steps:
  1. parse HTTP request;
  2. execute processing logic by calling HttpHandler.handle(Request, Response);
  3. flush HTTP response;
  4. return service thread to a thread-pool.
we see that it wouldn't be enough just to delegate HTTP request processing to a custom thread on step 2, because on step 3 Grizzly will automatically flush HTTP response back to client at the state it currently is. We need a way to instruct Grizzly to not do 3 automatically on the service thread, instead we want to be able to perform this step ourselves once asynchronous processing is complete.
Using Grizzly HttpServer API it could be achieved following way:
  • HttpResponse.suspend(...) to instruct Grizzly to not flush HTTP response in the service thread;
  • HttpResponse.resume() to finish HTTP request processing and flush response back to client.
So asynchronous version of the "complex" application (HttpHandler) will look like:
httpServer.getServerConfiguration().addHttpHandler(new HttpHandler() {
    final ExecutorService complexAppExecutorService =
        GrizzlyExecutorService.createInstance(
            ThreadPoolConfig.defaultConfig()
            .copy()
            .setCorePoolSize(5)
            .setMaxPoolSize(5));
            
    @Override
    public void service(final Request request, final Response response) throws Exception {
                
        response.suspend(); // Instruct Grizzly to not flush response, once we exit the service(...) method 
                
        complexAppExecutorService.execute(new Runnable() {   // Execute long-lasting task in the custom thread
            public void run() {
                try {
                    response.setContentType("text/plain");
                    // Simulate long lasting task
                    Thread.sleep(10000);
                    response.getWriter().write("Complex task is done!");
                } catch (Exception e) {
                    response.setStatus(HttpStatus.INTERNAL_SERVER_ERROR_500);
                } finally {
                    response.resume();  // Finishing HTTP request processing and flushing the response to the client
                }
            }
        });
    }
}, "/complex");
  • As you might have noticed, "complex" application uses Grizzly ExecutorService implementation. This is the preferred approach, however you can still use own ExecutorService.
The three most important steps in the code above are marked red:
  1. Suspend HTTP response processing: response.suspend()
  2. Delegating task to the custom thread pool: complexAppExecutorService.execute(...)
  3. Resuming HTTP response processing: response.resume()
Now, using your browser, you can make sure "simple" and "complex" applications are not affecting each other, and the "simple" application works just fine when the "complex" application is busy.

筆記

實作中筆者刻意將 thread pool 設定為僅有一個 Worker 的情況下運行。
在此案例中,如果我們先行呼叫 complex 而後在呼叫 simple 會發現,simple 需要等 complex 做完才可以運行。
在這樣的狀況下會發生一個問題,也就是當一些複雜的 API 被呼叫時可能造成一些簡單的 Request 在排隊等待
緊接著筆者透過另起一群 Thread pool 專門處理 complex 這樣即使只有一個 Worker 也不至會造成排隊的問題。
個人對此文章的理解為:建議耗時長的API應與耗時短的API,分別處理這樣可以使伺服器運作得更為順利

Best Practices

注意事項:
    1. 本文來源為 Grizzly 官方網站文件
    2. 請以官方文件為主,此文僅供參考使用
When developing a network application, we usually wonder how we can optimize it. How should the worker thread pool be sized? Which I/O strategy to employ?
There is no general answer for that question, but we’ll try to provide some tips.
  • IOStrategy
    In the IOStrategy section, we introduced different Grizzly IOStrategies.
    By default, Grizzly Transports use the worker-thread IOStrategy, which is reliable for any possible usecase. However, if the application processing logic doesn’t involve any blocking I/O operations, the same-thread IOStrategy can be used. For these cases, the same-thread strategy will yield better performance as there are no thread context switches.
    For example, if we implement general HTTP Servlet container, we can’t be sure about nature of specific Servlets developers may have. In this case it’s safer to use the worker-thread IOStrategy. However, if application uses the Grizzly’s HttpServer and HttpHandler, which leverages NIO streams, then the same-thread strategy could be used to optimize processing time and resource consumption;
  • Selector runners count
    The Grizzly runtime will automatically set the SelectorRunner count value equal to Runtime.getRuntime().availableProcessors(). Depending on the usecase, developers may change this value to better suit their needs.
    Scott Oaks, from the Glassfish performance team, suggests that there should be one SelectorRunner for every 1-4 cores on your machine; no more than that;
  • Worker thread pool
    In the Configuration threadpool-config section, the different thread pool implementations, and their pros and cons, were discussed.
    All IOStrategies, except the same-thread IOStrategy, use worker threads to process IOEvents which occur on Connections. A common question is how many worker threads will be needed by an application?
    In his blog, Scott suggests How many is “just enough”? It depends, of course – in a case where HTTP requests don’t use any external resource and are hence CPU bound, you want only as many HTTP request processing threads as you have CPUs on the machine. But if the HTTP request makes a database call (even indirectly, like by using a JPA entity), the request will block while waiting for the database, and you could profitably run another thread. So this takes some trial and error, but start with the same number of threads as you have CPU and increase them until you no longer see an improvement in throughput.
    Translating this to the general, non HTTP usecase: If IOEvent processing includes blocking I/O operation(s), which will make thread block doing nothing for some time (i.e, waiting for a result from a peer), it’s best to have more worker threads to not starve other request processing. For simpler application processes, the fewer threads, the better.

筆記

  • Grizzly 建議的調整方向可區分成三個 IOStrategy, Selector 和 Worker thread pool。
    • IOStrategy 調整伺服器運作模式,包含 Worker-thread(預設), Same-thread, Dynamic 和 Leader-follower。
    • Selector 建議配合電腦核心數去設定,每 1 ~ 4個核心對應 1 個 Selector(此預設值為 Runtime.getRuntime().availableProcessors() 的數值)
    • 官方提供建議 thread pool 設定為夠用就好並非越大就越快,此外 Worker thread pool 必須配合 IOStrategy 決定是否有效。如:Same-thread 模式中似乎並沒有運用到 Worker。

2015年10月29日 星期四

Java 之神奇的 1+1+1+1+1 = 1

注意事項:
    1. 文章內容為閱讀相關資訊後所做的測試結果及心得
    2. 可供參考、分享及測試
下面是某程式的一部分,表達的只是 5 個 1 相加,結果卻會是 1。
Integer a = 1;
Integer b = new Integer(1);
Integer c = Integer.valueOf(1);
Integer d = Integer.parseInt("1");
Integer e = Long.valueOf(1).intValue();

System.out.println("a : " + a);
System.out.println("b : " + b);
System.out.println("c : " + c);
System.out.println("d : " + d);
System.out.println("e : " + e);

System.out.println("a+b+c+d+e : " + (a+b+c+d+e)); // output 'a+b+c+d+e : 1'
在看解答前不仿先嘗試猜猜看原因為何。
.
.
.
.
.
.
.
完整程式碼 如下:
package sample.puzzler;

import java.lang.reflect.Field;

/**
 * Created by Cookie on 15/10/29.
 */
public class JavaPuzzler {

    public static void main(String[] args) throws IllegalAccessException, NoSuchFieldException {

        Integer one = 1;

        Field field = Integer.class.getDeclaredField("value");
        field.setAccessible(true);
        field.setInt(one, 0);

        Integer a = 1;
        Integer b = new Integer(1);
        Integer c = Integer.valueOf(1);
        Integer d = Integer.parseInt("1");
        Integer e = Long.valueOf(1).intValue();

        System.out.println("a : " + a);
        System.out.println("b : " + b);
        System.out.println("c : " + c);
        System.out.println("d : " + d);
        System.out.println("e : " + e);

        System.out.println("a+b+c+d+e : " + (a+b+c+d+e));

    }
}
Console Output
a : 0
b : 1
c : 0
d : 0
e : 0
a+b+c+d+e : 1
在程式後半段宣告 a ~ e 並執行 a+b+c+d+e 應該為 5,可執行結果卻為 1,究竟是什麼原因會造成這樣的結果呢?

程式解讀

Field field = Integer.class.getDeclaredField("value");
這是利用 Java 反射機制來取得 Integer 中名叫 value 的欄位。
field.setAccessible(true);
field.setInt(one, 0);
再來的兩行,分別是將存取權限改成 true,之後將整數 0 放入 one 這個物件的 value 欄位之中。
// 宣告 a ~ e
Integer a = 1;
Integer b = new Integer(1);
Integer c = Integer.valueOf(1);
Integer d = Integer.parseInt("1");
Integer e = Long.valueOf(1).intValue();

// 輸出每一個變數的數值
System.out.println("a : " + a);
System.out.println("b : " + b);
System.out.println("c : " + c);
System.out.println("d : " + d);
System.out.println("e : " + e);

// 計算 a+b+c+d+e 然後輸出到畫面
System.out.println("a+b+c+d+e : " + (a+b+c+d+e));
後面幾行如一開始所說的是宣告 a ~ e 的變數,之後做輸出及加總後輸出。

認識 IntegerCache

IntegerCache 是 Integer 物件中的一個 Inner Class。如其名是用來存放 Integer 的 Cache 以達到加速的效果,IntegerCache 程式碼 內容如下:
private static class IntegerCache {
    static final int low = -128;
    static final int high;
    static final Integer cache[];

    static {
        // high value may be configured by property
        int h = 127;
        String integerCacheHighPropValue =
            sun.misc.VM.getSavedProperty("java.lang.Integer.IntegerCache.high");
        if (integerCacheHighPropValue != null) {
            int i = parseInt(integerCacheHighPropValue);
            i = Math.max(i, 127);
            // Maximum array size is Integer.MAX_VALUE
            h = Math.min(i, Integer.MAX_VALUE - (-low) -1);
        }
        high = h;

        cache = new Integer[(high - low) + 1];
        int j = low;
        for(int k = 0; k < cache.length; k++)
            cache[k] = new Integer(j++);
    }

    private IntegerCache() {}
}

觀察 Integer.valueOf()

public static Integer valueOf(int i) {
    assert IntegerCache.high >= 127;
    if (i >= IntegerCache.low && i <= IntegerCache.high)
        return IntegerCache.cache[i + (-IntegerCache.low)];
    return new Integer(i);
}

結論

在實作 Integer 的工程師,為了提高 Integer 物件的效能,定義了一段 Cache 的規則,大致上內容為如果所宣告的數值 -128 ~ 127 之間,將直接從 integerCache 中直接取得相對應的物件。
Example:
Integer one = 1;
Integer a = 1;
Integer c = Integer.valueOf(1);
上列三個參考皆會指向到相同的 Integer 物件並不會不同,除非是自行宣告為 new Integer(1),否則是不會產生新物件的。所以當使用field.setInt(one, 0);將 one 所指向的物件進行修改。由於 a 和 c 也指在相同物件上自然也就會被修改到。

2015年10月28日 星期三

Java Sample Trigger and Masker

注意事項:
    1. 文章內容為使用二進制實現開關效果
    2. 利用一個整數來存取大量開關
    3. 文章及程式碼皆為 CookieTsai 撰寫,可供參考及分享

目錄

前言

有些時候,在 Coding 時會面臨需要使用大量開關來處理的問題。
在這時候程式碼會使用大量 if else 造成不易閱讀及不易維護。除此之外,在儲存這些設定時可能會需要用到許多欄位。過多的判斷式及欄位將造成維護上的困擾。
因此,這幾天嘗試實作了一個叫 Trigger 還有 Masker 的兩種物件。Trigger and Masker 是利用一個整數來便於管理及儲存這大量的開關的設計。
文章的 Start Demo 是實作 SampleMasker 及 SampleMain 來管理五個開關。

Defined Trigger and Masker

Trigger Java Class

package sample.trigger;

/**
 * Created by Cookie on 15/10/29.
 */
public class Trigger {
    public static final int ALL_ON  = -1;
    public static final int ALL_OFF = 0;

    private int value;

    public Trigger(Masker... maskers) {
        for (Masker masker: maskers) {
            on(masker);
        }
    }

    public void on(Masker... maskers){
        for (Masker masker: maskers) {
            value = (value | masker.getMask());
        }
    }

    public void off(Masker... maskers){
        for (Masker masker: maskers) {
            if (!isOn(masker)) {
                continue;
            }
            value -= masker.getMask();
        }
    }

    public boolean isOn(Masker masker){
        return (value & masker.getMask()) > 0;
    }

    public int getValue() {
        return value;
    }

    @Override
    public String toString() {
        return String.valueOf(getValue());
    }
}

Masker Interface

package sample.trigger;

/**
 * Created by Cookie on 15/10/29.
 */
public interface Masker {
    boolean isOn(int value);
    int getIndex();
    int getMask();
}

Start Demo

Sample Masker Java Class

package sample.trigger;

/**
 * Created by Cookie on 15/10/29.
 */
public enum  SampleMasker implements Masker {

    OPTION_01(1),
    OPTION_02(2),
    OPTION_03(3),
    OPTION_04(4),
    OPTION_05(5);

    private final int index;
    private final int mask;

    SampleMasker(int index) {
        this.index = index;
        this.mask = (1 << (index -1));
    }

    public int getIndex() {
        return index;
    }

    public int getMask() {
        return mask;
    }

    public boolean isOn(int value){
        return (value & getMask()) > 0;
    }

    public static Trigger newTrigger() {
        return new Trigger();
    }

    public static Trigger newTrigger(Integer value) {
        if (value == null) {
            return new Trigger(values());
        }
        Trigger result = new Trigger();
        for (Masker masker: values()) {
            if (masker.isOn(value)) {
                result.on(masker);
            }
        }
        return result;
    }

    @Override
    public String toString() {
        return this.name();
    }
}

Sample Main Java Class

package sample.trigger;

/**
 * Created by Cookie on 15/10/29.
 */
public class SampleMain {

    public static void main(String[] args) {

        Trigger trigger;

        System.out.println("========== Sample 1 output ==========");

        // Create a new Trigger
        trigger = SampleMasker.newTrigger();

        // Open the option 02 and 05
        trigger.on(SampleMasker.OPTION_02, SampleMasker.OPTION_05);

        // show all option
        for (Masker masker: SampleMasker.values()) {
            System.out.println(masker.toString() + " => " + trigger.isOn(masker) );
        }

        // show the trigger value
        System.out.println("Get Trigger Value => " + trigger.getValue());
        System.out.println();

        System.out.println("========== Sample 2 output ==========");

        // Create a new Trigger,and all option be on
        trigger = SampleMasker.newTrigger(Trigger.ALL_ON);

        // Close the option 04
        trigger.off(SampleMasker.OPTION_04);

        for (Masker masker: SampleMasker.values()) {
            System.out.println(masker.toString() + " => " + trigger.isOn(masker) );
        }

        // show the trigger value
        System.out.println("Get Trigger Value => " + trigger.getValue());
        System.out.println();

        int value = trigger.getValue();

        System.out.println("========== Sample 3 output ==========");

        // Using a trigger value to create a new Trigger
        trigger = SampleMasker.newTrigger(value);

        // show all option
        for (Masker masker: SampleMasker.values()) {
            System.out.println(masker.toString() + " => " + trigger.isOn(masker) );
        }

        // show the trigger value
        System.out.println("turn to Int Value => " + trigger.getValue());
        System.out.println();

    }
}

Console Output

========== Sample 1 output ==========
OPTION_01 => false
OPTION_02 => true
OPTION_03 => false
OPTION_04 => false
OPTION_05 => true
Get Trigger Value => 18

========== Sample 2 output ==========
OPTION_01 => true
OPTION_02 => true
OPTION_03 => true
OPTION_04 => false
OPTION_05 => true
Get Trigger Value => 23

========== Sample 3 output ==========
OPTION_01 => true
OPTION_02 => true
OPTION_03 => true
OPTION_04 => false
OPTION_05 => true
turn to Int Value => 23

2015年10月20日 星期二

Java Thread Pool Example using Executors and ThreadPoolExecutor

A thread pool manages the pool of worker threads, it contains a queue that keeps tasks waiting to get executed. A thread pool manages the collection of Runnable threads and worker threads execute Runnable from the queue. 
java.util.concurrent.Executors provide implementation of java.util.concurrent.Executor interface to create the thread pool in java.
Let’s write a simple program to explain it’s working.
First we need to have a Runnable class.
package com.journaldev.threadpool;

public class WorkerThread implements Runnable {

    private String command;

    public WorkerThread(String s){
        this.command=s;
    }

    @Override
    public void run() {
        System.out.println(Thread.currentThread().getName()+" Start. Command = "+command);
        processCommand();
        System.out.println(Thread.currentThread().getName()+" End.");
    }

    private void processCommand() {
        try {
            Thread.sleep(5000);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
    }

    @Override
    public String toString(){
        return this.command;
    }
}
Here is the test program where we are creating fixed thread pool from Executors framework.
package com.journaldev.threadpool;

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class SimpleThreadPool {

    public static void main(String[] args) {
        ExecutorService executor = Executors.newFixedThreadPool(5);
        for (int i = 0; i < 10; i++) {
            Runnable worker = new WorkerThread("" + i);
            executor.execute(worker);
          }
        executor.shutdown();
        while (!executor.isTerminated()) {
        }
        System.out.println("Finished all threads");
    }

}
In above program, we are creating fixed size thread pool of 5 worker threads. Then we are submitting 10 jobs to this pool, since the pool size is 5, it will start working on 5 jobs and other jobs will be in wait state, as soon as one of the job is finished, another job from the wait queue will be picked up by worker thread and get’s executed.
Here is the output of the above program.
pool-1-thread-2 Start. Command = 1
pool-1-thread-4 Start. Command = 3
pool-1-thread-1 Start. Command = 0
pool-1-thread-3 Start. Command = 2
pool-1-thread-5 Start. Command = 4
pool-1-thread-4 End.
pool-1-thread-5 End.
pool-1-thread-1 End.
pool-1-thread-3 End.
pool-1-thread-3 Start. Command = 8
pool-1-thread-2 End.
pool-1-thread-2 Start. Command = 9
pool-1-thread-1 Start. Command = 7
pool-1-thread-5 Start. Command = 6
pool-1-thread-4 Start. Command = 5
pool-1-thread-2 End.
pool-1-thread-4 End.
pool-1-thread-3 End.
pool-1-thread-5 End.
pool-1-thread-1 End.
Finished all threads
The output confirms that there are five threads in the pool named from “pool-1-thread-1? to “pool-1-thread-5? and they are responsible to execute the submitted tasks to the pool.
Executors class provide simple implementation of ExecutorService using ThreadPoolExecutor but ThreadPoolExecutor provides much more feature than that. We can specify the number of threads that will be alive when we create ThreadPoolExecutor instance and we can limit the size of thread pool and create our own RejectedExecutionHandler implementation to handle the jobs that can’t fit in the worker queue.
Here is our custom implementation of RejectedExecutionHandler interface.
package com.journaldev.threadpool;

import java.util.concurrent.RejectedExecutionHandler;
import java.util.concurrent.ThreadPoolExecutor;

public class RejectedExecutionHandlerImpl implements RejectedExecutionHandler {

    @Override
    public void rejectedExecution(Runnable r, ThreadPoolExecutor executor) {
        System.out.println(r.toString() + " is rejected");
    }

}
ThreadPoolExecutor provides several methods using which we can find out the current state of executor, pool size, active thread count and task count. 
So I have a monitor thread that will print the executor information at certain time interval.
package com.journaldev.threadpool;

import java.util.concurrent.ThreadPoolExecutor;

public class MyMonitorThread implements Runnable
{
    private ThreadPoolExecutor executor;

    private int seconds;

    private boolean run=true;

    public MyMonitorThread(ThreadPoolExecutor executor, int delay)
    {
        this.executor = executor;
        this.seconds=delay;
    }

    public void shutdown(){
        this.run=false;
    }

    @Override
    public void run()
    {
        while(run){
                System.out.println(
                    String.format("[monitor] [%d/%d] Active: %d, Completed: %d, Task: %d, isShutdown: %s, isTerminated: %s",
                        this.executor.getPoolSize(),
                        this.executor.getCorePoolSize(),
                        this.executor.getActiveCount(),
                        this.executor.getCompletedTaskCount(),
                        this.executor.getTaskCount(),
                        this.executor.isShutdown(),
                        this.executor.isTerminated()));
                try {
                    Thread.sleep(seconds*1000);
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
        }

    }
}
Here is the thread pool implementation example using ThreadPoolExecutor.
package com.journaldev.threadpool;

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class WorkerPool {

    public static void main(String args[]) throws InterruptedException{
        //RejectedExecutionHandler implementation
        RejectedExecutionHandlerImpl rejectionHandler = new RejectedExecutionHandlerImpl();
        //Get the ThreadFactory implementation to use
        ThreadFactory threadFactory = Executors.defaultThreadFactory();
        //creating the ThreadPoolExecutor
        ThreadPoolExecutor executorPool = new ThreadPoolExecutor(2, 4, 10, TimeUnit.SECONDS, new ArrayBlockingQueue<Runnable>(2), threadFactory, rejectionHandler);
        //start the monitoring thread
        MyMonitorThread monitor = new MyMonitorThread(executorPool, 3);
        Thread monitorThread = new Thread(monitor);
        monitorThread.start();
        //submit work to the thread pool
        for(int i=0; i<10; i++){
            executorPool.execute(new WorkerThread("cmd"+i));
        }

        Thread.sleep(30000);
        //shut down the pool
        executorPool.shutdown();
        //shut down the monitor thread
        Thread.sleep(5000);
        monitor.shutdown();

    }
}
Notice that while initializing the ThreadPoolExecutor, we are keeping initial pool size as 2, maximum pool size to 4 and work queue size as 2. So if there are 4 running tasks and more tasks are submitted, the work queue will hold only 2 of them and rest of them will be handled by RejectedExecutionHandlerImpl.
Here is the output of above program that confirms above statement.
pool-1-thread-1 Start. Command = cmd0
pool-1-thread-4 Start. Command = cmd5
cmd6 is rejected
pool-1-thread-3 Start. Command = cmd4
pool-1-thread-2 Start. Command = cmd1
cmd7 is rejected
cmd8 is rejected
cmd9 is rejected
[monitor] [0/2] Active: 4, Completed: 0, Task: 6, isShutdown: false, isTerminated: false
[monitor] [4/2] Active: 4, Completed: 0, Task: 6, isShutdown: false, isTerminated: false
pool-1-thread-4 End.
pool-1-thread-1 End.
pool-1-thread-2 End.
pool-1-thread-3 End.
pool-1-thread-1 Start. Command = cmd3
pool-1-thread-4 Start. Command = cmd2
[monitor] [4/2] Active: 2, Completed: 4, Task: 6, isShutdown: false, isTerminated: false
[monitor] [4/2] Active: 2, Completed: 4, Task: 6, isShutdown: false, isTerminated: false
pool-1-thread-1 End.
pool-1-thread-4 End.
[monitor] [4/2] Active: 0, Completed: 6, Task: 6, isShutdown: false, isTerminated: false
[monitor] [2/2] Active: 0, Completed: 6, Task: 6, isShutdown: false, isTerminated: false
[monitor] [2/2] Active: 0, Completed: 6, Task: 6, isShutdown: false, isTerminated: false
[monitor] [2/2] Active: 0, Completed: 6, Task: 6, isShutdown: false, isTerminated: false
[monitor] [2/2] Active: 0, Completed: 6, Task: 6, isShutdown: false, isTerminated: false
[monitor] [2/2] Active: 0, Completed: 6, Task: 6, isShutdown: false, isTerminated: false
[monitor] [0/2] Active: 0, Completed: 6, Task: 6, isShutdown: true, isTerminated: true
[monitor] [0/2] Active: 0, Completed: 6, Task: 6, isShutdown: true, isTerminated: true
Notice the change in active, completed and total completed task count of the executor. We can invoke shutdown() method to finish execution of all the submitted tasks and terminate the thread pool.
Reference: Java Thread Pool Example using Executors and ThreadPoolExecutor from our JCG partner Pankaj Kumar at the Developer Recipes blog.

2015年10月12日 星期一

Learning Homebrew

注意事項:
    1. 本文來源 http://blog.lyhdev.com/2011/06/homebrew-mac-os-x.html
    2. 目的為紀錄便於日後查閱翻找
    3. 文章內容僅供參考
用過 Ubuntu 或其他 Linux 的朋友,是否覺得 Mac OS X 少了些方便好用的軟體工具呢?
雖然 Mac 有很棒的 AppStore ,可以線上下載/購買安裝許多好用的軟體,不過有一些在 Linux 系統經常使用的工具,卻找不到類似 apt 或 yum 等好用的套件管理工具,可以幫忙自動安裝和移除。
例如我經常使用 wget 抓取 URL 檔案,或是使用 lftp 連到遠端伺服器上下傳資料,可惜這些在 Linux 系統很普遍使用的工具, OS X 並未內建。
幸好 OS X 和 Linux 同樣具有 Unix 的血緣,所以不少常用的工具,其實都能用原始碼進行編譯及安裝。 OS X 內建的開發環境相當不錯,提供 make 及 gcc 等需要的工具,我們只需要一套簡便的套件管理工具,協助下載原始碼及完成編譯安裝等動作。
Homebrew 是本文要推薦的工具,它使用 Ruby 語言開發,目標是成為一套簡單並且具有彈性的套件管理工具,協助使用者在 OS X 系統上安裝 Unix 程式。
Homebrew - The missing package manager for OS X
在安裝 Homebrew 之前,系統必須裝有 ruby 開發工具,一般的 Mac OS X 系統已內建 ruby ,因使用以下的指令即可安裝 Homebrew 。
$ ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"
安裝好 Homebrew 之後,可以用 brew 指令進行操作,常用的指令格式有四種。
更新套件清單
$ brew update
搜尋某項軟體
$ brew search 軟體名稱
安裝新軟體
$ brew install 軟體名稱
移除軟體
$ brew uninstall 軟體名稱
例如要安裝 wget ,可以輸入:
$ brew install wget
就可以看到 Homebrew 自動幫忙下載原始碼壓縮檔,解壓縮,執行 ./configure 設定,編譯(make)及安裝(make install),這真的太方便了。
如果遇到需要的軟體,已經無法下載檔案,該怎麼辦呢?
以 lftp 來說,現在就會遇到這種情況,所以會顯示以下錯誤:
curl: (22) The requested URL returned error: 404
這真是太悲情了!
幸好 Homebrew 的設定真是超級簡單!超級EASY!只要修改一下設定檔就可以解決。
首先用瀏覽器打開 http://ftp.yars.free.net/pub/source/lftp/ 網址,會發現問題的原因在於,檔案名稱 lftp-4.2.2.tar.bz2 已經不存在,當然無法下載;因為 lftp 已經發佈新版本,檔名是 lftp-4.3.1.tar.bz2 。
Homebrew 針對每一種軟體套件,都有獨立的個別設定檔,稱為 FORMULA ,這些設定檔位於 /usr/local/Library/Formula 路徑下。
以 lftp 工具來說,設定檔就是 /usr/local/Library/Formula/lftp.rb ,用文字編輯器打開,可以發現設定檔其實就是 Ruby 程式。以前面所提到的問題,只需要修改兩個地方,請參考以下紅色字體標示的部份。
require 'formula'

class Lftp < Formula

    url 'http://ftp.yars.free.net/pub/source/lftp/lftp-4.3.1.tar.bz2'

    homepage 'http://lftp.yar.ru/'
    md5 'ea45acfb47b5590d4675c50dc0c6e13c'
修改存檔後,再次執行 brew install lftp ,即可完成安裝。
由於 Homebrew 是免費的開放源碼軟體,諸如此類的問題未來還會不斷發生,但是它至今仍有不斷在更新及維護,只要有更多人參與使用或開發,就會協助 Homebrew 變得更好。
您可以利用以下的管道,取得更多 Homebrew 的消息,並參與討論及提供建議。
  • IRC (irc://irc.freenode.net/#machomebrew)
  • Mailing List (homebrew@librelist.com)
  • Twitter (http://twitter.com/machomebrew)
  • GitHub (http://github.com/mxcl/homebrew)

延伸閱讀

Install Homebrew — The missing package manager for OS X

注意事項:
    1. 需要使用到 ruby 請確保安裝 homebrew 前已裝好
    2. 內容主要目的為方便查閱,內容整理自官網及手冊文件中
    3. 文章僅供參考

Descreption

Homebrew is the easiest and most flexible way to install the UNIX tools Apple didn't include with OS X.

Install

$ ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"

version

$ brew -v

help

Example usage:
  brew [info | home | options ] [FORMULA...]
  brew install FORMULA...
  brew uninstall FORMULA...
  brew search [foo]
  brew list [FORMULA...]
  brew update
  brew upgrade [--all | FORMULA...]
  brew pin/unpin [FORMULA...]

Troubleshooting:
  brew doctor
  brew install -vd FORMULA
  brew [--env | config]

Brewing:
  brew create [URL [--no-fetch]]
  brew edit [FORMULA...]
  open https://github.com/Homebrew/homebrew/blob/master/share/doc/homebrew/Formula-Cookbook.md

Further help:
  man brew
  brew home

Links

2015年9月18日 星期五

Learning HBase Shell

注意事項:
    1. 文章內容須先將 Apache HBase 建置完成。
    2. 文章中一行表示一個欄位,一筆則是由多行組成的一個 Row。
    3. 本文並不包含所有旨令內容,主要為常用指令。
    4. 僅供參考使用。

目錄

Getting Start

  • 進入 HBase 指令模式
    $ hbase shell

List

  • 取得 資料表 列表
    hbase> list

Create

  • 新增 資料表
    Here is some for this command:
    Creates a table. Pass a table name, and a set of column family specifications (at least one), and, optionally, table configuration. Column specification can be a simple string (name), or a dictionary (dictionaries are described below in main help output), necessarily including NAME attribute.
    Examples:
    Create a table with namespace=ns1 and table qualifier=t1
    • hbase> create 'ns1:t1', {NAME => 'f1', VERSIONS => 5}
    Create a table with namespace=default and table qualifier=t1
    • hbase> create 't1', {NAME => 'f1'}, {NAME => 'f2'}, {NAME => 'f3'}
    • hbase> create 't1', 'f1', 'f2', 'f3'
    • hbase> create 't1', {NAME => 'f1', VERSIONS => 1, TTL => 2592000, BLOCKCACHE => true}
    • hbase> create 't1', {NAME => 'f1', CONFIGURATION => {'hbase.hstore.blockingStoreFiles' => '10'}}
    Table configuration options can be put at the end.
    Examples:
    • hbase> create 'ns1:t1', 'f1', SPLITS => ['10', '20', '30', '40']
    • hbase> create 't1', 'f1', SPLITS => ['10', '20', '30', '40']
    • hbase> create 't1', 'f1', SPLITS_FILE => 'splits.txt', OWNER => 'johndoe'
    • hbase> create 't1', {NAME => 'f1', VERSIONS => 5}, METADATA => { 'mykey' => 'myvalue' }
    • hbase> create 't1', 'f1', {NUMREGIONS => 15, SPLITALGO => 'HexStringSplit'}
    • hbase> create 't1', 'f1', {NUMREGIONS => 15, SPLITALGO => 'HexStringSplit', CONFIGURATION => {'hbase.hregion.scan.loadColumnFamiliesOnDemand' => 'true'}}
    You can also keep around a reference to the created table:
    • hbase> t1 = create 't1', 'f1'
    Which gives you a reference to the table named 't1', on which you can then call methods.
  • Training
    hbase> create 'test', 'cf'

Put

  • 新增一行資料
    Here is some help for this command: Put a cell 'value' at specified table/row/column and optionally timestamp coordinates. To put a cell value into table 'ns1:t1' or 't1' at row 'r1' under column 'c1' marked with the time 'ts1', do:
    • hbase> put 'ns1:t1', 'r1', 'c1', 'value'
    • hbase> put 't1', 'r1', 'c1', 'value'
    • hbase> put 't1', 'r1', 'c1', 'value', ts1
    • hbase> put 't1', 'r1', 'c1', 'value', {ATTRIBUTES=>{'mykey'=>'myvalue'}}
    • hbase> put 't1', 'r1', 'c1', 'value', ts1, {ATTRIBUTES=>{'mykey'=>'myvalue'}}
    • hbase> put 't1', 'r1', 'c1', 'value', ts1, {VISIBILITY=>'PRIVATE|SECRET'}
    The same commands also can be run on a table reference. Suppose you had a reference t to table 't1', the corresponding command would be:
    • hbase> t.put 'r1', 'c1', 'value', ts1, {ATTRIBUTES=>{'mykey'=>'myvalue'}}
  • Training
    hbase> put 'test','row0','cf:string','字串'
    hbase> put 'test','row0','cf:boolean',"\x01"
    hbase> put 'test','row0','cf:short',"\x00\x01"
    hbase> put 'test','row0','cf:int',"\x00\x00\x00\x01"
    hbase> put 'test','row0','cf:long',"\x00\x00\x00\x00\x00\x00\x00\x01"
    hbase> put 'test','row0','cf:float',"?\x80\x00\x00"
    hbase> put 'test','row0','cf:double',"?\xF0\x00\x00\x00\x00\x00\x00"
    hbase> put 'test','row1','cf:name','Cookie'
    hbase> put 'test','row1','cf:phone','0999123456'
    hbase> put 'test','row2','cf:name','Tom'
    hbase> put 'test','row3','cf:name','Mary'

Get

  • 取得一筆資料
    Here is some help for this command: Get row or cell contents; pass table name, row, and optionally a dictionary of column(s), timestamp, timerange and versions. Examples:
    • hbase> get 'ns1:t1', 'r1'
    • hbase> get 't1', 'r1'
    • hbase> get 't1', 'r1', {TIMERANGE => [ts1, ts2]}
    • hbase> get 't1', 'r1', {COLUMN => 'c1'}
    • hbase> get 't1', 'r1', {COLUMN => ['c1', 'c2', 'c3']}
    • hbase> get 't1', 'r1', {COLUMN => 'c1', TIMESTAMP => ts1}
    • hbase> get 't1', 'r1', {COLUMN => 'c1', TIMERANGE => [ts1, ts2], VERSIONS => 4}
    • hbase> get 't1', 'r1', {COLUMN => 'c1', TIMESTAMP => ts1, VERSIONS => 4}
    • hbase> get 't1', 'r1', {FILTER => "ValueFilter(=, 'binary:abc')"}
    • hbase> get 't1', 'r1', 'c1'
    • hbase> get 't1', 'r1', 'c1', 'c2'
    • hbase> get 't1', 'r1', ['c1', 'c2']
    • hbsae> get 't1','r1', {COLUMN => 'c1', ATTRIBUTES => {'mykey'=>'myvalue'}}
    • hbsae> get 't1','r1', {COLUMN => 'c1', AUTHORIZATIONS => ['PRIVATE','SECRET']}
    Besides the default 'toStringBinary' format, 'get' also supports custom formatting by column. A user can define a FORMATTER by adding it to the column name in the get specification. The FORMATTER can be stipulated:
    1. either as a org.apache.hadoop.hbase.util.Bytes method name (e.g, toInt, toString)
    2. or as a custom class followed by method name: e.g. 'c(MyFormatterClass).format'.
    Example formatting cf:qualifier1 and cf:qualifier2 both as Integers:
    • hbase> get 't1', 'r1' {COLUMN => ['cf:qualifier1:toInt', 'cf:qualifier2:c(org.apache.hadoop.hbase.util.Bytes).toInt'] }
    Note that you can specify a FORMATTER by column only (cf:qualifer). You cannot specify a FORMATTER for all columns of a column family.
    The same commands also can be run on a reference to a table (obtained via gettable or createtable). Suppose you had a reference t to table 't1', the corresponding commands would be:
    • hbase> t.get 'r1'
    • hbase> t.get 'r1', {TIMERANGE => [ts1, ts2]}
    • hbase> t.get 'r1', {COLUMN => 'c1'}
    • hbase> t.get 'r1', {COLUMN => ['c1', 'c2', 'c3']}
    • hbase> t.get 'r1', {COLUMN => 'c1', TIMESTAMP => ts1}
    • hbase> t.get 'r1', {COLUMN => 'c1', TIMERANGE => [ts1, ts2], VERSIONS => 4}
    • hbase> t.get 'r1', {COLUMN => 'c1', TIMESTAMP => ts1, VERSIONS => 4}
    • hbase> t.get 'r1', {FILTER => "ValueFilter(=, 'binary:abc')"}
    • hbase> t.get 'r1', 'c1'
    • hbase> t.get 'r1', 'c1', 'c2'
    • hbase> t.get 'r1', ['c1', 'c2']
  • Training
    hbase> get 'test','row0'
    hbase> get 'test','row0',['cf:string','cf:boolean','cf:float']
    hbase> get 'test','row0', ['cf:string:toString','cf:boolean:toBoolean','cf:int:toInt','cf:float:toFloat']

Scan

注意事項:
    * Scan Filter 常用包含:
        1. RowFilter
        2. SingleColumnValueFilter
        3. ValueFilter
        4. PrefixFilter
    * FILTER ByteArrayComparable 常用包含:
        1. binary
        2. substring
        3. regexstring
        4. binaryprefix 
  • 掃描 Table(查詢多筆資料)
    Here is some help for this command: Scan a table; pass table name and optionally a dictionary of scanner specifications. Scanner specifications may include one or more of: TIMERANGE, FILTER, LIMIT, STARTROW, STOPROW, TIMESTAMP, MAXLENGTH, or COLUMNS, CACHE or RAW, VERSIONS
    If no columns are specified, all columns will be scanned. To scan all members of a column family, leave the qualifier empty as in 'col_family:'.
    The filter can be specified in two ways:
    1. Using a filterString - more information on this is available in the Filter Language document attached to the HBASE-4176 JIRA
    2. Using the entire package name of the filter.
    Some examples:
    • hbase> scan 'hbase:meta'
    • hbase> scan 'hbase:meta', {COLUMNS => 'info:regioninfo'}
    • hbase> scan 'ns1:t1', {COLUMNS => ['c1', 'c2'], LIMIT => 10, STARTROW => 'xyz'}
    • hbase> scan 't1', {COLUMNS => ['c1', 'c2'], LIMIT => 10, STARTROW => 'xyz'}
    • hbase> scan 't1', {COLUMNS => 'c1', TIMERANGE => [1303668804, 1303668904]}
    • hbase> scan 't1', {REVERSED => true}
    • hbase> scan 't1', {FILTER => "(PrefixFilter ('row2') AND (QualifierFilter (>=, 'binary:xyz'))) AND (TimestampsFilter ( 123, 456))"}
    • hbase> scan 't1', {FILTER = org.apache.hadoop.hbase.filter.ColumnPaginationFilter.new(1, 0)} For setting the Operation Attributes
    • hbase> scan 't1', { COLUMNS => ['c1', 'c2'], ATTRIBUTES => {'mykey' => 'myvalue'}}
    • hbase> scan 't1', { COLUMNS => ['c1', 'c2'], AUTHORIZATIONS => ['PRIVATE','SECRET']} For experts, there is an additional option -- CACHE_BLOCKS -- which switches block caching for the scanner on (true) or off (false). By default it is enabled.
    Examples:
    • hbase> scan 't1', {COLUMNS => ['c1', 'c2'], CACHE_BLOCKS => false}
    Also for experts, there is an advanced option -- RAW -- which instructs the scanner to return all cells (including delete markers and uncollected deleted cells). This option cannot be combined with requesting specific COLUMNS. Disabled by default. 
    Example:
    • hbase> scan 't1', {RAW => true, VERSIONS => 10}
    Besides the default 'toStringBinary' format, 'scan' supports custom formatting by column. A user can define a FORMATTER by adding it to the column name in the scan specification. The FORMATTER can be stipulated:
    1. either as a org.apache.hadoop.hbase.util.Bytes method name (e.g, toInt, toString)
    2. or as a custom class followed by method name: e.g. 'c(MyFormatterClass).format'.
    Example formatting cf:qualifier1 and cf:qualifier2 both as Integers: * hbase> scan 't1', {COLUMNS => ['cf:qualifier1:toInt', 'cf:qualifier2:c(org.apache.hadoop.hbase.util.Bytes).toInt'] }
    Note that you can specify a FORMATTER by column only (cf:qualifer). You cannot specify a FORMATTER for all columns of a column family.
    Scan can also be used directly from a table, by first getting a reference to a table, like such:
    • hbase> t = get_table 't'
    • hbase> t.scan
    Note in the above situation, you can still provide all the filtering, columns, options, etc as described above.
  • Training
    hbase> scan 'test'
    hbase> scan 'test', {STARTROW => 'row1', STOPROW => 'row2~'}
    hbase> scan 'test', {COLUMNS => 'cf:name'}
    hbase> scan 'test', {COLUMNS => ['cf:string:toString','cf:short:toShort','cf:long:toLong']}
    hbase> scan 'test', {FILTER => "ValueFilter(=,'binary:Cookie')"}
    hbase> scan 'test', {FILTER => "SingleColumnValueFilter('cf','name',=,'substring:o')"}
    hbase> scan 'test', {FILTER => "RowFilter(=,'regexstring:[03]$')"}

Delete

  • 刪除一行資料
    Here is some help for this command: Put a delete cell value at specified table/row/column and optionally timestamp coordinates. Deletes must match the deleted cell's coordinates exactly. When scanning, a delete cell suppresses older versions. To delete a cell from 't1' at row 'r1' under column 'c1' marked with the time 'ts1', do:
    • hbase> delete 'ns1:t1', 'r1', 'c1', ts1
    • hbase> delete 't1', 'r1', 'c1', ts1
    • hbase> delete 't1', 'r1', 'c1', ts1, {VISIBILITY=>'PRIVATE|SECRET'}
    The same command can also be run on a table reference. Suppose you had a reference t to table 't1', the corresponding command would be:
    • hbase> t.delete 'r1', 'c1', ts1
    • hbase> t.delete 'r1', 'c1', ts1, {VISIBILITY=>'PRIVATE|SECRET'}
  • Training
    hbase> delete 'test', 'row1', 'cf:phone'

Delete All

  • 刪除一筆資料
    Here is some help for this command:
    Delete all cells in a given row; pass a table name, row, and optionally a column and timestamp. 
    Examples:
    • hbase> deleteall 'ns1:t1', 'r1'
    • hbase> deleteall 't1', 'r1'
    • hbase> deleteall 't1', 'r1', 'c1'
    • hbase> deleteall 't1', 'r1', 'c1', ts1
    • hbase> deleteall 't1', 'r1', 'c1', ts1, {VISIBILITY=>'PRIVATE|SECRET'}
    The same commands also can be run on a table reference. Suppose you had a reference t to table 't1', the corresponding command would be:
    • hbase> t.deleteall 'r1'
    • hbase> t.deleteall 'r1', 'c1'
    • hbase> t.deleteall 'r1', 'c1', ts1
    • hbase> t.deleteall 'r1', 'c1', ts1, {VISIBILITY=>'PRIVATE|SECRET'}
  • Training
    hbase> deleteall 'test', 'row2'

Disable

  • 關閉 資料表
    Here is some help for this command:
    Start disable of named table:
    • hbase> disable 't1'
    • hbase> disable 'ns1:t1'
  • Training
    hbase> disable 'test'

Enable

  • 開啟 資料表
    Here is some help for this command:
    Start enable of named table:
    • hbase> enable 't1'
    • hbase> enable 'ns1:t1'
  • Training
    hbase> ensable 'test'

Truncate

注意事項:效果完全等同於 disable + drop + create 等指令。
  • 清空資料表
    Here is some help for this command:
    Disables, drops and recreates the specified table.
  • Training
    hbase> truncate 'test'

Drop

注意事項:執行 drop 前須確認資料表處於關閉狀態。(須先執行 diable)
  • 刪除 資料表
    Here is some help for this command:
    Drop the named table. Table must first be disabled:
    • hbase> drop 't1'
    • hbase> drop 'ns1:t1'
  • Training
    hbase> drop 'test'