注意事項:
1. 文章內容須先將 Apache HBase 建置完成。
2. 文章中一行表示一個欄位,一筆則是由多行組成的一個 Row。
3. 本文並不包含所有旨令內容,主要為常用指令。
4. 僅供參考使用。
目錄
Getting Start
進入 HBase 指令模式
$ hbase shell
List
Create
Put
新增一行資料
Here is some help for this command: Put a cell 'value' at specified table/row/column and optionally timestamp coordinates. To put a cell value into table 'ns1:t1' or 't1' at row 'r1' under column 'c1' marked with the time 'ts1', do:
- hbase> put 'ns1:t1', 'r1', 'c1', 'value'
- hbase> put 't1', 'r1', 'c1', 'value'
- hbase> put 't1', 'r1', 'c1', 'value', ts1
- hbase> put 't1', 'r1', 'c1', 'value', {ATTRIBUTES=>{'mykey'=>'myvalue'}}
- hbase> put 't1', 'r1', 'c1', 'value', ts1, {ATTRIBUTES=>{'mykey'=>'myvalue'}}
- hbase> put 't1', 'r1', 'c1', 'value', ts1, {VISIBILITY=>'PRIVATE|SECRET'}
The same commands also can be run on a table reference. Suppose you had a reference t to table 't1', the corresponding command would be:
- hbase> t.put 'r1', 'c1', 'value', ts1, {ATTRIBUTES=>{'mykey'=>'myvalue'}}
Training
hbase> put 'test','row0','cf:string','字串'
hbase> put 'test','row0','cf:boolean',"\x01"
hbase> put 'test','row0','cf:short',"\x00\x01"
hbase> put 'test','row0','cf:int',"\x00\x00\x00\x01"
hbase> put 'test','row0','cf:long',"\x00\x00\x00\x00\x00\x00\x00\x01"
hbase> put 'test','row0','cf:float',"?\x80\x00\x00"
hbase> put 'test','row0','cf:double',"?\xF0\x00\x00\x00\x00\x00\x00"
hbase> put 'test','row1','cf:name','Cookie'
hbase> put 'test','row1','cf:phone','0999123456'
hbase> put 'test','row2','cf:name','Tom'
hbase> put 'test','row3','cf:name','Mary'
Get
取得一筆資料
Here is some help for this command: Get row or cell contents; pass table name, row, and optionally a dictionary of column(s), timestamp, timerange and versions. Examples:
- hbase> get 'ns1:t1', 'r1'
- hbase> get 't1', 'r1'
- hbase> get 't1', 'r1', {TIMERANGE => [ts1, ts2]}
- hbase> get 't1', 'r1', {COLUMN => 'c1'}
- hbase> get 't1', 'r1', {COLUMN => ['c1', 'c2', 'c3']}
- hbase> get 't1', 'r1', {COLUMN => 'c1', TIMESTAMP => ts1}
- hbase> get 't1', 'r1', {COLUMN => 'c1', TIMERANGE => [ts1, ts2], VERSIONS => 4}
- hbase> get 't1', 'r1', {COLUMN => 'c1', TIMESTAMP => ts1, VERSIONS => 4}
- hbase> get 't1', 'r1', {FILTER => "ValueFilter(=, 'binary:abc')"}
- hbase> get 't1', 'r1', 'c1'
- hbase> get 't1', 'r1', 'c1', 'c2'
- hbase> get 't1', 'r1', ['c1', 'c2']
- hbsae> get 't1','r1', {COLUMN => 'c1', ATTRIBUTES => {'mykey'=>'myvalue'}}
- hbsae> get 't1','r1', {COLUMN => 'c1', AUTHORIZATIONS => ['PRIVATE','SECRET']}
Besides the default 'toStringBinary' format, 'get' also supports custom formatting by column. A user can define a FORMATTER by adding it to the column name in the get specification. The FORMATTER can be stipulated:
- either as a org.apache.hadoop.hbase.util.Bytes method name (e.g, toInt, toString)
- or as a custom class followed by method name: e.g. 'c(MyFormatterClass).format'.
Example formatting cf:qualifier1 and cf:qualifier2 both as Integers:
- hbase> get 't1', 'r1' {COLUMN => ['cf:qualifier1:toInt', 'cf:qualifier2:c(org.apache.hadoop.hbase.util.Bytes).toInt'] }
Note that you can specify a FORMATTER by column only (cf:qualifer). You cannot specify a FORMATTER for all columns of a column family.
The same commands also can be run on a reference to a table (obtained via gettable or createtable). Suppose you had a reference t to table 't1', the corresponding commands would be:
- hbase> t.get 'r1'
- hbase> t.get 'r1', {TIMERANGE => [ts1, ts2]}
- hbase> t.get 'r1', {COLUMN => 'c1'}
- hbase> t.get 'r1', {COLUMN => ['c1', 'c2', 'c3']}
- hbase> t.get 'r1', {COLUMN => 'c1', TIMESTAMP => ts1}
- hbase> t.get 'r1', {COLUMN => 'c1', TIMERANGE => [ts1, ts2], VERSIONS => 4}
- hbase> t.get 'r1', {COLUMN => 'c1', TIMESTAMP => ts1, VERSIONS => 4}
- hbase> t.get 'r1', {FILTER => "ValueFilter(=, 'binary:abc')"}
- hbase> t.get 'r1', 'c1'
- hbase> t.get 'r1', 'c1', 'c2'
- hbase> t.get 'r1', ['c1', 'c2']
Training
hbase> get 'test','row0'
hbase> get 'test','row0',['cf:string','cf:boolean','cf:float']
hbase> get 'test','row0', ['cf:string:toString','cf:boolean:toBoolean','cf:int:toInt','cf:float:toFloat']
Scan
注意事項:
* Scan Filter 常用包含:
1. RowFilter
2. SingleColumnValueFilter
3. ValueFilter
4. PrefixFilter
* FILTER ByteArrayComparable 常用包含:
1. binary
2. substring
3. regexstring
4. binaryprefix
掃描 Table(查詢多筆資料)
Here is some help for this command: Scan a table; pass table name and optionally a dictionary of scanner specifications. Scanner specifications may include one or more of: TIMERANGE, FILTER, LIMIT, STARTROW, STOPROW, TIMESTAMP, MAXLENGTH, or COLUMNS, CACHE or RAW, VERSIONS
If no columns are specified, all columns will be scanned. To scan all members of a column family, leave the qualifier empty as in 'col_family:'.
The filter can be specified in two ways:
- Using a filterString - more information on this is available in the Filter Language document attached to the HBASE-4176 JIRA
- Using the entire package name of the filter.
Some examples:
- hbase> scan 'hbase:meta'
- hbase> scan 'hbase:meta', {COLUMNS => 'info:regioninfo'}
- hbase> scan 'ns1:t1', {COLUMNS => ['c1', 'c2'], LIMIT => 10, STARTROW => 'xyz'}
- hbase> scan 't1', {COLUMNS => ['c1', 'c2'], LIMIT => 10, STARTROW => 'xyz'}
- hbase> scan 't1', {COLUMNS => 'c1', TIMERANGE => [1303668804, 1303668904]}
- hbase> scan 't1', {REVERSED => true}
- hbase> scan 't1', {FILTER => "(PrefixFilter ('row2') AND (QualifierFilter (>=, 'binary:xyz'))) AND (TimestampsFilter ( 123, 456))"}
- hbase> scan 't1', {FILTER = org.apache.hadoop.hbase.filter.ColumnPaginationFilter.new(1, 0)} For setting the Operation Attributes
- hbase> scan 't1', { COLUMNS => ['c1', 'c2'], ATTRIBUTES => {'mykey' => 'myvalue'}}
- hbase> scan 't1', { COLUMNS => ['c1', 'c2'], AUTHORIZATIONS => ['PRIVATE','SECRET']} For experts, there is an additional option -- CACHE_BLOCKS -- which switches block caching for the scanner on (true) or off (false). By default it is enabled.
Examples:
- hbase> scan 't1', {COLUMNS => ['c1', 'c2'], CACHE_BLOCKS => false}
Also for experts, there is an advanced option -- RAW -- which instructs the scanner to return all cells (including delete markers and uncollected deleted cells). This option cannot be combined with requesting specific COLUMNS. Disabled by default.
Example:
- hbase> scan 't1', {RAW => true, VERSIONS => 10}
Besides the default 'toStringBinary' format, 'scan' supports custom formatting by column. A user can define a FORMATTER by adding it to the column name in the scan specification. The FORMATTER can be stipulated:
- either as a org.apache.hadoop.hbase.util.Bytes method name (e.g, toInt, toString)
- or as a custom class followed by method name: e.g. 'c(MyFormatterClass).format'.
Example formatting cf:qualifier1 and cf:qualifier2 both as Integers: * hbase> scan 't1', {COLUMNS => ['cf:qualifier1:toInt', 'cf:qualifier2:c(org.apache.hadoop.hbase.util.Bytes).toInt'] }
Note that you can specify a FORMATTER by column only (cf:qualifer). You cannot specify a FORMATTER for all columns of a column family.
Scan can also be used directly from a table, by first getting a reference to a table, like such:
- hbase> t = get_table 't'
- hbase> t.scan
Note in the above situation, you can still provide all the filtering, columns, options, etc as described above.
Training
hbase> scan 'test'
hbase> scan 'test', {STARTROW => 'row1', STOPROW => 'row2~'}
hbase> scan 'test', {COLUMNS => 'cf:name'}
hbase> scan 'test', {COLUMNS => ['cf:string:toString','cf:short:toShort','cf:long:toLong']}
hbase> scan 'test', {FILTER => "ValueFilter(=,'binary:Cookie')"}
hbase> scan 'test', {FILTER => "SingleColumnValueFilter('cf','name',=,'substring:o')"}
hbase> scan 'test', {FILTER => "RowFilter(=,'regexstring:[03]$')"}
Delete
Delete All
Disable
關閉 資料表
Here is some help for this command:
Start disable of named table:
- hbase> disable 't1'
- hbase> disable 'ns1:t1'
Training
hbase> disable 'test'
Enable
開啟 資料表
Here is some help for this command:
Start enable of named table:
- hbase> enable 't1'
- hbase> enable 'ns1:t1'
Training
hbase> ensable 'test'
Truncate
注意事項:效果完全等同於 disable + drop + create 等指令。
清空資料表
Here is some help for this command:
Disables, drops and recreates the specified table.
Training
hbase> truncate 'test'
Drop
注意事項:執行 drop 前須確認資料表處於關閉狀態。(須先執行 diable)
刪除 資料表
Here is some help for this command:
Drop the named table. Table must first be disabled:
- hbase> drop 't1'
- hbase> drop 'ns1:t1'
Training
hbase> drop 'test'