class StringBuffer

Overview

A StringBuffer is intended to be used as a buffer for strings or binary data. Crystal does not permit the String class to be subclassed, so the StringBuffer is implemented as a very light wrapper around a String. It works by holding a String with a specific maximum capacity, which should be larger than the largest piece of data that will be stored within it. The string's contents are then changed by mutating the string's memory buffer and header information directly, avoiding extra copying of data.

buffer = StringBuffer.new(256)
buffer.mutate("Hello, world!")
puts buffer # => "Hello, world!"

buffer << "This is a test."
puts buffer # => "This is a test."

The StringBuffer will truncate any data that exceeds the capacity of its underlying string, so there is no risk of the string's memory buffer being overrun.
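The truncation behavior can be pictured with a minimal sketch that uses only the standard library. The copy_truncated helper below is hypothetical and is not part of StringBuffer; it simply copies no more bytes than the target can hold, which is the guarantee described above.

```crystal
# Hypothetical helper (not part of StringBuffer): copy at most
# target.size bytes from data into target and report how many were copied.
def copy_truncated(data : Bytes, target : Bytes) : Int32
  count = Math.min(data.size, target.size)
  # Slice#copy_to raises if the target is too small, so only the
  # clamped prefix of the source is copied.
  data[0, count].copy_to(target)
  count
end

target = Bytes.new(8) # fixed capacity of 8 bytes, zero-filled
written = copy_truncated("Hello, world!".to_slice, target)

puts written                        # => 8
puts String.new(target[0, written]) # => Hello, w
```

Truncating at a byte boundary like this can split a multi-byte UTF-8 character, so the sketch is best suited to ASCII or binary data.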

Defined in:

base58/extensions/string.cr

Constructor Detail

def self.new(string : String)

Initialize the buffer with a String.

buffer = StringBuffer.new("0x00" * 256)

def self.new(slice : Slice(UInt8))

Initialize the buffer with a Slice(UInt8).

buffer = StringBuffer.new(Slice(UInt8).new(256, 0))

def self.new(capacity : Int = 256)

Initialize the buffer with a specific capacity, but do nothing to clear the underlying memory buffer. Until data is assigned to the buffer, its contents will be undefined and meaningless.

buffer = StringBuffer.new(256)
pp buffer.buffer.to_slice[0, 8] # => Bytes[0, 0, 27, 28, 33, 0, 0, 0]

Instance Method Detail

def <<(val)

Convenience shorthand for #mutate.


def buffer : String

Returns the underlying String.


def capacity : Int32

def header

Returns the object header of the underlying String. Like #to_unsafe, this method is exposed because other internals use it, but it's unlikely that you will want or need to use it directly.


def mutate(val)

This method takes a memory buffer referenced by a Pointer(UInt8), and encodes it into an existing String. It may be useful to understand how this works, however, so pull up a chair and a drink, dear reader, and we'll have a short chat.

A String, in Crystal, is represented by four in-memory pieces of data. These are a Type ID, a byte size, a character size, and the bytes of data that comprise the actual string.

| Type ID | Byte Size | Character Size | Bytes |

The first three of those, taken together, represent the String header. The String class holds an undocumented constant, String::HEADER_SIZE, which is the size of the header, in bytes.

When a String is initially created, a memory buffer of HEADER_SIZE + capacity is allocated, where #capacity is the maximum number of bytes that the String can hold.

The HEADER_SIZE is used as an offset into this buffer to point to the part of the buffer which will hold the bytes of string data, and that data is inserted into memory starting with that offset.

To finalize the String's memory buffer, a header is written into the first HEADER_SIZE bytes of the buffer, which contains the Type ID, the byte size, and the character size of the string.

Because this is just data in memory, it is possible to access it directly, and manipulate it. Also, a String still works just fine if the allocated memory for it is larger than what is actually used to store the header plus the string data. This provides an opportunity to directly mutate a String.
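As a rough illustration of that layout, the header of an ordinary String can be read back through a pointer cast. This is a sketch that assumes the three-Int32 header described above; the layout is an internal detail of Crystal's String and could change between compiler versions.

```crystal
str = "héllo" # 5 characters, 6 bytes in UTF-8

# Reinterpret the String reference as a pointer to its three Int32
# header fields and read them out. Nothing is modified here.
header = str.as({Int32, Int32, Int32}*).value
type_id, byte_size, _char_size = header

puts "type id matches:   #{type_id == str.crystal_type_id}"
puts "byte size matches: #{byte_size == str.bytesize}"

# Size of the assumed header, and the distance from the start of the
# object to the first byte of string data.
puts "header size: #{sizeof({Int32, Int32, Int32})}"
puts "data offset: #{str.to_unsafe.address - str.object_id}"
```

If the assumptions hold, the data offset printed on the last line equals the header size, which is the offset the text refers to as HEADER_SIZE.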

If the data after the HEADER_SIZE offset is changed, the string is changed. However, if the amount of data changes, the header must also be updated to reflect the new size of the string. That header is just bytes, though, so it can be rewritten.

header = string.as({Int32, Int32, Int32}*)                          # Effectively extracts the header from the String as a Pointer({Int32, Int32, Int32}).
header.value = {String::TYPE_ID, new_byte_size, new_character_size} # Rewrites the header. MUST NOT exceed original byte_size.

As mentioned in the comments above, the new data that is inserted into the buffer must not exceed its original size. If it does, at best, something else might come along later and stomp on that data, but more likely, the program will crash:

Invalid memory access (signal 11) at address 0x168b0ae

This limitation exists because the GC.realloc call, which can resize an allocation to a smaller or a larger size, does not guarantee that a grown allocation will remain in the same location. If there is not enough free space to grow the allocation in place, realloc will copy the contents of the old buffer to a new location and then free the old one. When that happens, the rest of your program does not know that the string has moved. Any later access goes to the old location, which is no longer valid, and the program will likely crash.

So... don't do that. Within the limitation regarding not exceeding the original size of the String, however, it appears to work flawlessly.
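The whole procedure can be sketched with nothing but the standard library. This is a minimal illustration under stated assumptions, not the StringBuffer implementation. The overwrite helper and the CAPACITY constant are hypothetical names, the String is created with the unsafe String.new(capacity) block constructor, and a character size of 0 is written on the assumption that String treats it as "not yet computed" and recalculates it on demand. Data longer than the capacity is clamped, in keeping with the limitation above.

```crystal
CAPACITY = 16 # hypothetical fixed capacity, in bytes

# Allocate a String that fills its whole capacity. Reporting the full
# capacity as the byte size is meant to keep String.new from shrinking
# the allocation (an assumption about the standard library's behavior).
str = String.new(CAPACITY) do |buf|
  buf.clear(CAPACITY)  # zero-fill the data area
  {CAPACITY, CAPACITY} # {byte size, character size}
end

# Hypothetical helper: overwrite the bytes of str in place with data,
# clamped to capacity, then rewrite the header to match.
def overwrite(str : String, capacity : Int32, data : String) : Nil
  count = Math.min(data.bytesize, capacity)
  str.to_unsafe.copy_from(data.to_unsafe, count)
  str.to_unsafe[count] = 0_u8 # trailing null byte, still inside the allocation

  header = str.as({Int32, Int32, Int32}*)
  # Character size 0 is assumed to mean "compute lazily".
  header.value = {str.crystal_type_id, count, 0}
end

overwrite(str, CAPACITY, "Hello, world!")
puts str          # => Hello, world!
puts str.bytesize # => 13

overwrite(str, CAPACITY, "This string is longer than sixteen bytes")
puts str          # => This string is l
puts str.bytesize # => 16
```

Clamping to the capacity is what keeps the write inside the original allocation, so no reallocation is ever needed and the String never moves.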


def to_s(io : IO)

Outputs the contents of the underlying String to the given IO.


def to_unsafe

Returns a Pointer(UInt8) to the underlying String's data. This method is exposed because other internals use it, but it's unlikely that you will want or need to use it directly.


Macro Detail

macro method_missing(call)