Sunday, March 22, 2009

Java Buffer

A buffer is an object that holds data of a primitive type for reading or writing. It provides structured access to that data while keeping track of how much has been read or written. Buffers let I/O operations work on blocks of data instead of byte by byte (stream-oriented), which speeds up I/O.

To understand buffers in depth we need to take a tour of the buffer internals.

Buffer Internals

State Variables

State variables keep the "internal accounting" for a buffer. Each read or write operation updates the buffer's state, which lets the buffer manage its resources and lets us perform I/O operations in blocks. A buffer has three state variables that track its state and the data it holds:

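Position – the index of the next element to be read or written. It tracks how much data has been written into or read from the buffer so far, i.e., where the next block of data should be added to the buffer or read from it.

Limit – the index of the first element that must not be read or written. While writing, it marks where the writable space ends (initially the capacity); after flip(), it marks where the written data ends. The difference between limit and position is how much data is left to read or how much space is left to write into.

Capacity – the maximum amount of data that the buffer can hold. It is fixed when the buffer is created.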

This brings us to the equation,

position ≤ limit ≤ capacity where none of the state variables can be negative.

Now let us visualize these variables. Assume the capacity of our buffer is 16 bytes, shown as blank slots below.

State: Empty

position = 0                                                 limit, capacity = 16
↓____ ____ ____ ____ ____ ____ ____ ____ ____ ____ ____ ____ ____ ____ ____ ____↓

State: First write of 8 bytes

                                        position = 8         limit, capacity = 16
__1_ __1_ __1_ __1_ __1_ __1_ __1_ __1_ ↓____ ____ ____ ____ ____ ____ ____ ____↓

State: Second write of 4 bytes

                                              position = 12  limit, capacity = 16
__1_ __1_ __1_ __1_ __1_ __1_ __1_ __1_ __1_ __1_ __1_ __1_ ↓____ ____ ____ ____↓

Now let us flip the buffer so that we can read the data back. flip() sets the limit to the current position and resets the position to 0.

State: flip()

position = 0                                        limit = 12       capacity = 16
↓__1_ __1_ __1_ __1_ __1_ __1_ __1_ __1_ __1_ __1_ __1_ __1_ ↓____ ____ ____ ____↓

The buffer is now ready to be read from.

State: Read 8 bytes

                                  position = 8      limit = 12       capacity = 16
__1_ __1_ __1_ __1_ __1_ __1_ __1_ __1_ ↓__1_ __1_ __1_ __1_ ↓____ ____ ____ ____↓

The next read can return at most 4 more bytes from our buffer because the limit is set to 12.

State: Read 4 bytes

                                         position, limit = 12       capacity = 16
__1_ __1_ __1_ __1_ __1_ __1_ __1_ __1_ __1_ __1_ __1_ __1_ ↓____ ____ ____ ____↓

Finally, we clear the buffer before using it again. clear() sets the position to 0 and the limit to the buffer's capacity. It does not erase the data; later writes simply overwrite it.

State: clear()

position = 0                                                 limit, capacity = 16
↓____ ____ ____ ____ ____ ____ ____ ____ ____ ____ ____ ____ ____ ____ ____ ____↓
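
The walk-through above can be reproduced in code. The following is a minimal sketch, assuming a heap ByteBuffer of 16 bytes; the class name BufferStateDemo and its helper methods are made up for illustration. It records the three state variables after each step.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

class BufferStateDemo {
    // Formats the three state variables of a buffer as one line of text.
    static String state(ByteBuffer b) {
        return "position=" + b.position() + " limit=" + b.limit() + " capacity=" + b.capacity();
    }

    // Repeats the write / flip / read / clear sequence and records each state.
    static List<String> walkThrough() {
        List<String> states = new ArrayList<>();
        ByteBuffer buffer = ByteBuffer.allocate(16);
        states.add(state(buffer));   // empty
        buffer.put(new byte[8]);     // first write of 8 bytes
        states.add(state(buffer));
        buffer.put(new byte[4]);     // second write of 4 bytes
        states.add(state(buffer));
        buffer.flip();               // limit = position, position = 0
        states.add(state(buffer));
        buffer.get(new byte[8]);     // read 8 bytes
        states.add(state(buffer));
        buffer.get(new byte[4]);     // read the remaining 4 bytes
        states.add(state(buffer));
        buffer.clear();              // position = 0, limit = capacity
        states.add(state(buffer));
        return states;
    }

    public static void main(String[] args) {
        for (String s : walkThrough()) {
            System.out.println(s);
        }
    }
}
```

Each printed line should match the corresponding picture, for example position=0 limit=12 capacity=16 right after flip().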

Accessor Methods

Get (ByteBuffer)
  1. byte get(); - returns a single byte
  2. ByteBuffer get( byte dest[] ); - reads a group of bytes into the array dest
  3. ByteBuffer get( byte dest[], int offset, int length ); - reads length bytes into the array dest, starting at dest[offset]
  4. byte get( int index ); - returns the byte at the position specified by index

Methods 1-3 respect the buffer's state variables: they read from the current position and advance it. Method 4 does not: it ignores the position and changes neither the position nor the limit, although the index must still be smaller than the limit. Method 4 is called an absolute method, while the others are relative. Methods 2 and 3 return the buffer on which they were called, which allows method chaining when needed:

buffer.get(data).flip();
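
To make the difference between relative and absolute reads concrete, here is a small sketch. The class name GetDemo and the sample values are assumptions chosen for illustration.

```java
import java.nio.ByteBuffer;

class GetDemo {
    // Returns {first, third, dest[0], dest[1], final position}.
    static int[] run() {
        ByteBuffer buffer = ByteBuffer.wrap(new byte[] {10, 20, 30, 40});
        byte first = buffer.get();    // relative: reads index 0, position becomes 1
        byte third = buffer.get(2);   // absolute: reads index 2, position stays 1
        byte[] dest = new byte[2];
        buffer.get(dest);             // relative bulk read: indexes 1 and 2, position becomes 3
        return new int[] {first, third, dest[0], dest[1], buffer.position()};
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(run()));
    }
}
```

The absolute read in the middle leaves the position untouched, so the bulk read that follows continues from index 1.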

Put (ByteBuffer)

1. ByteBuffer put( byte b ); - puts one byte in the buffer

2. ByteBuffer put( byte src[] ); - puts an array of bytes in the buffer

3. ByteBuffer put( byte src[], int offset, int length ); - puts length bytes from the array src, starting at src[offset], in the buffer

4. ByteBuffer put( ByteBuffer src ); - copies data from source buffer into this buffer

5. ByteBuffer put( int index, byte b ); - puts data byte into the position specified by index

Here method 5 is absolute and all the others are relative.

The methods discussed above all belong to the ByteBuffer class. The other buffer types have equivalent get() and put() methods for the primitive type they handle.

The ByteBuffer class also has methods to get or put data of a specific primitive type, such as getInt() and putInt(), in both absolute and relative form.
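
As a brief sketch of the typed accessors, the example below mixes relative and absolute forms of putInt() together with putDouble(). The class name TypedAccessDemo and the values are assumptions for illustration.

```java
import java.nio.ByteBuffer;

class TypedAccessDemo {
    // Writes an int and a double, overwrites the int absolutely, then reads both back.
    static String run() {
        ByteBuffer buffer = ByteBuffer.allocate(16);
        buffer.putInt(1);         // relative: 4 bytes, position 0 -> 4
        buffer.putDouble(3.5);    // relative: 8 bytes, position 4 -> 12
        buffer.putInt(0, 42);     // absolute: overwrites bytes 0..3, position stays 12
        buffer.flip();            // limit = 12, position = 0
        int i = buffer.getInt();
        double d = buffer.getDouble();
        return i + " " + d + " " + buffer.remaining();
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```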

ByteBuffer Quick Facts

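  1. A newly allocated ByteBuffer is empty: its position is 0, its limit equals its capacity, and every byte is zero.
  2. The duplicate and slice methods perform a shallow copy of the original ByteBuffer. The returned buffer shares its content with the original, so any change to the data made through one is visible through the other. The position, limit, and mark of the two buffers are independent.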
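
The shallow-copy behaviour can be checked with a short sketch like the one below. The class name DuplicateDemo is made up for illustration.

```java
import java.nio.ByteBuffer;

class DuplicateDemo {
    // Returns {byte seen through the original, original position, duplicate position}.
    static int[] run() {
        ByteBuffer original = ByteBuffer.allocate(4);
        ByteBuffer copy = original.duplicate();
        copy.put((byte) 7);   // writes index 0 through the duplicate
        return new int[] {original.get(0), original.position(), copy.position()};
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(run()));
    }
}
```

The byte written through the duplicate shows up in the original, while only the duplicate's position has moved.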

Other handy methods

Creating buffers: allocate() and wrap()

A buffer can be created by allocating space for it with allocate() or by wrapping an existing array with wrap().

ByteBuffer buffer = ByteBuffer.allocate(1024);

Allocates 1024 bytes of space for the object buffer.

You can also wrap an array of primitive type into a corresponding buffer.

byte[] arr = new byte[1024];

ByteBuffer buffer = ByteBuffer.wrap(arr);

Both buffer and arr share the same memory space now.
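
A quick way to see the shared memory is to write through one side and read through the other. This is a minimal sketch; the class name WrapDemo is an assumption for illustration.

```java
import java.nio.ByteBuffer;

class WrapDemo {
    // Returns {arr[0] after a buffer write, buffer.get(1) after an array write, same-array flag}.
    static int[] run() {
        byte[] arr = new byte[4];
        ByteBuffer buffer = ByteBuffer.wrap(arr);
        buffer.put((byte) 5);   // written through the buffer, lands in arr[0]
        arr[1] = 9;             // written to the array, visible through the buffer
        int sameArray = (buffer.array() == arr) ? 1 : 0;
        return new int[] {arr[0], buffer.get(1), sameArray};
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(run()));
    }
}
```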

Direct vs. non-direct ByteBuffer Allocations

A direct ByteBuffer is allocated in native OS memory, outside the Java heap, so that the JVM can try to perform native I/O operations on it directly, although Java does not guarantee that it will succeed in doing so. Allocating a direct ByteBuffer is costly, but it provides faster I/O.

ByteBuffer byte_buff = ByteBuffer.allocateDirect (2000);

The other primitive buffer types have no allocateDirect() method, but we can create a view buffer on a direct ByteBuffer to read the data as another primitive type while still using ByteBuffer's direct allocation underneath.

ByteBuffer byte_buff = ByteBuffer.allocateDirect(2000);
// View the same direct memory as chars (2 bytes per char)
CharBuffer cbuf = byte_buff.asCharBuffer();
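
The sketch below shows what such a view buffer looks like in use. The class name DirectViewDemo and the sample text are assumptions for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;

class DirectViewDemo {
    // Writes two chars through a CharBuffer view of a direct ByteBuffer.
    static String run() {
        ByteBuffer bytes = ByteBuffer.allocateDirect(2000);
        CharBuffer chars = bytes.asCharBuffer();   // 2000 bytes hold 1000 chars
        chars.put("hi");                           // moves only the view's position
        chars.flip();                              // make the written chars readable
        return chars.toString() + " " + bytes.isDirect() + " " + chars.isDirect()
                + " " + chars.capacity() + " " + bytes.position();
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

The view is itself direct because its backing ByteBuffer is direct, and writing through the view does not move the position of the ByteBuffer.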

Slicing buffers: slice()

slice() creates a sub-buffer from the buffer it is called on, starting at that buffer's current position and ending at its limit. Both buffers share the same memory space, so slicing a buffer creates a shallow copy.

ByteBuffer origBuffer = ByteBuffer.allocate(16);

origBuffer.position(4);

origBuffer.limit(12);

ByteBuffer slicedBuffer = origBuffer.slice();

Now if we add 4 to each value in the sliced buffer, the original buffer can be represented as shown below. The slice covers indexes 4 through 11 of the original buffer.

index 0              slice start = 4              slice end = 12       capacity = 16
↓__1_ __1_ __1_ __1_ ↓__5_ __5_ __5_ __5_ __5_ __5_ __5_ __5_ ↓____ ____ ____ ____↓

This feature supports data abstraction: you can write functions that work on a whole buffer or on a slice of its data.
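
The slicing example can be sketched in runnable form as follows. This version assumes every byte of the original buffer starts out as 1 so that the change is easy to see; the class name SliceDemo is made up for illustration.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

class SliceDemo {
    // Adds 4 to every byte of the slice and returns the shared backing array.
    static byte[] run() {
        byte[] backing = new byte[16];
        Arrays.fill(backing, (byte) 1);                 // assumption: all values start as 1
        ByteBuffer origBuffer = ByteBuffer.wrap(backing);
        origBuffer.position(4);
        origBuffer.limit(12);
        ByteBuffer slicedBuffer = origBuffer.slice();   // position 0, capacity 8
        for (int i = 0; i < slicedBuffer.capacity(); i++) {
            // absolute get/put: index i of the slice is index i + 4 of the original
            slicedBuffer.put(i, (byte) (slicedBuffer.get(i) + 4));
        }
        return backing;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(run()));
    }
}
```

Only indexes 4 through 11 of the backing array change, because the slice cannot reach outside the window that was set on the original buffer.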

Marking the buffer position: mark()

mark() records the current position of the buffer so that a subsequent reset() returns the position to the marked position.

Rewind Buffer: rewind()

rewind() sets the buffer position to 0 and discards any mark. The limit is left unchanged.
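
A small sketch of mark(), reset(), and rewind() working together is shown below. The class name MarkDemo and the chosen positions are assumptions for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.InvalidMarkException;

class MarkDemo {
    // Returns {position after reset, position after rewind, 1 if the mark was discarded}.
    static int[] run() {
        ByteBuffer buffer = ByteBuffer.allocate(16);
        buffer.position(4);
        buffer.mark();              // remember position 4
        buffer.position(10);
        buffer.reset();             // back to the marked position
        int afterReset = buffer.position();
        buffer.rewind();            // position 0, mark discarded
        int afterRewind = buffer.position();
        int markDiscarded = 0;
        try {
            buffer.reset();         // no mark is set any more
        } catch (InvalidMarkException e) {
            markDiscarded = 1;
        }
        return new int[] {afterReset, afterRewind, markDiscarded};
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(run()));
    }
}
```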

Creating read-only buffers: asReadOnlyBuffer()

ByteBuffer buffer = ByteBuffer.allocate(1024);
ByteBuffer readOnlyBuffer = buffer.asReadOnlyBuffer();
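
The following sketch shows how a read-only buffer behaves: it still sees changes made through the original buffer but rejects writes of its own. The class name ReadOnlyDemo is made up for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.ReadOnlyBufferException;

class ReadOnlyDemo {
    // Tries to write through the read-only view and reports what happened.
    static String run() {
        ByteBuffer buffer = ByteBuffer.allocate(4);
        ByteBuffer readOnly = buffer.asReadOnlyBuffer();
        buffer.put(0, (byte) 3);          // the content is shared with the view
        try {
            readOnly.put(0, (byte) 9);    // writes through the view are rejected
            return "write succeeded";
        } catch (ReadOnlyBufferException e) {
            return "read-only, sees " + readOnly.get(0);
        }
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```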

Buffer in Action

Copying data from input stream into buffer and writing the data from the buffer into output stream.


import java.io.*;
import java.nio.*;
import java.nio.channels.*;
public class BufferCopy
{
  public static void main(String[] args) throws IOException
  {
    FileInputStream inFile = new FileInputStream(args[0]);
    FileOutputStream outFile = new FileOutputStream(args[1]);
    FileChannel inChannel = inFile.getChannel();
    FileChannel outChannel = outFile.getChannel();

    ByteBuffer buffer = ByteBuffer.allocate(1024*1024);

    for (; inChannel.read(buffer) != -1; buffer.clear())
    {
      buffer.flip();
      while (buffer.hasRemaining())
       outChannel.write(buffer);
    }
    inChannel.close();
    outChannel.close();
  }
}

Converting ByteBuffer to CharBuffer

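char[] data = "ByteToCharBuffer".toCharArray();
// Each char occupies 2 bytes in the backing ByteBuffer
ByteBuffer bb = ByteBuffer.allocate(data.length * 2);
CharBuffer cb = bb.asCharBuffer();
cb.put(data);
// Make the written chars readable: limit = position, position = 0
cb.flip();
while (cb.hasRemaining())
  System.out.print(cb.get() + " ");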


Wrap a char array into a CharBuffer

// No allocate() call is needed: the buffer is backed by the array itself
char[] myBuffer = new char[100];
CharBuffer cb = CharBuffer.wrap(myBuffer);


Converting between string and bytes

// Requires: import java.nio.*; import java.nio.charset.*;
// Create the encoder and decoder
Charset charset = Charset.forName("ISO-8859-1");
CharsetDecoder decoder = charset.newDecoder();
CharsetEncoder encoder = charset.newEncoder();
try
{
  // Convert string to bytes (ISO-8859-1) in a ByteBuffer
  ByteBuffer bbuf = encoder.encode(CharBuffer.wrap("string"));

  // Convert bytes from the ByteBuffer into a CharBuffer and then to a string
  CharBuffer cbuf = decoder.decode(bbuf);
  String s = cbuf.toString();
}
catch (CharacterCodingException e) {
  // Thrown for malformed input or characters the charset cannot represent
  e.printStackTrace();
}

String and byte conversion using the direct allocation for ByteBuffer

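// Create a direct ByteBuffer for channeling the data
ByteBuffer bytebuf = ByteBuffer.allocateDirect(1024);
// Create a non-direct CharBuffer and put the characters to encode into it
CharBuffer charbuf = CharBuffer.allocate(1024);
charbuf.put("string");
// flip charbuf so that the encoder reads only what was written
charbuf.flip();
// Convert characters in charbuf to bytes in bytebuf;
// reset() clears any state left over from earlier use of the encoder
encoder.reset();
encoder.encode(charbuf, bytebuf, true);
encoder.flush(bytebuf);
// flip bytebuf before reading from it
bytebuf.flip();
// clear charbuf so that the decoder has room to write into it
charbuf.clear();
// Convert bytes in bytebuf to characters in charbuf
decoder.reset();
decoder.decode(bytebuf, charbuf, true);
decoder.flush(charbuf);
// flip charbuf before reading from it
charbuf.flip();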

8 comments:

  1. what's the benefit of using a buffer instead of, say, an array?

    1. Buffers have block operations available to them, giving them hardware abstractions that arrays don't typically benefit from. The concept of Buffer-Channel I/O was derived from operating system models to begin with, particularly those of UNIX models, and derive heavily from system I/O operations; so after a Buffer method is called, the action is effectively native code. Put simply, it's much faster.

  2. You said:
    Limit – keeps track of how much data is left in the buffer to read from or how much space is left in the buffer to write data into

    really? Because that's not what your pictures show. As you write into the buffer more, the limit doesn't change. So what does this mean? That as more is written into the buffer, there is no change in the amount of "data left in the buffer to write data into". Fine. But how is that different than capacity? It's not. So what's limit as distinct from capacity? You never told us.

  3. @softwarevisualization
    A buffer's limit is the index of the first element that should not be read from or written into. If you look at the picture example, while writing into the buffer the limit stays equal to the capacity, allowing as much to be written into the buffer as its capacity permits. Once flipped (to read from the buffer), the limit takes the value of the position (=12), though the capacity of the buffer remains the same (=16). A buffer's capacity is always fixed, whereas the limit value changes.
    Please let me know if you would need further details on this.

  4. very good post, with the images easy to understand.

  5. I liked the diagram. I am currently trying to find out the trivia behind method name "flip", which doesn't sound very intuitive to me.

    1. Thanks...
      If you are using the buffer to read from and write into, once you are done with writing into the buffer, to make the buffer ready for read, you need to call flip() on the buffer. flip() gets the buffer ready for reading. If you look at the visual representation above, without the flip(), if we allow reading from the buffer, we will have to set the start position to 0 (coz if we continue from the current position we will read garbage /get exception). Also we will have to set the limit pointer to the last written content (current position) so that we know where the written content finished. flip() method performs these two steps for us.
      Hope this helps.

  6. Thanks for the valuable explanation.

    btw, I think there is a mistake in this part:
    Wrap a char array into a charBuffer
    CharBuffer buffer = CharBuffer.allocate(8);
    char[] myBuffer = new char[100];
    CharBuffer cb = CharBuffer.wrap(myBuffer);
