ace::NGram Class Reference

Class designed to hold the data of a single N-gram.

#include <ngram.h>

Public Member Functions

 NGram (void)
 Default constructor. Constructs a zero-typed N-gram with the frequency counter set to 0 and no context store.
 NGram (const raw_ngram_t &raw_ngram, ngram_type_t ngram_type)
 Raw N-gram constructor.
 NGram (const NGram &source)
 Copy construction of NGrams is allowed.
 NGram (const NGram &source, ngram_type_t ngram_type)
 "Down-cast" copy-constructor.
ngram_size_t degree () const
bool is_zero (void) const
size_t size_of (void) const
ngram_type_t type (void) const
Member * begin (void)
const Member * begin (void) const
Member * end (void)
const Member * end (void) const
Member * get (ngram_size_t index)
const Member * get (ngram_size_t index) const
frequency_t frequency (void) const
void inc (void)
 Increments frequency counter by 1.
string_index_t lemma (ngram_size_t i) const
tag_index_t tag (ngram_size_t i) const
ngram_size_t parent (ngram_size_t i) const
dependency_index_t dependency (ngram_size_t i) const
void add_to_context (const PartOfContext &part_of_context)
 Adds given part_of_context to N-gram context.
context_t * context (void)
const context_t * context (void) const

Static Public Member Functions

static void init (void)
 Initializes all static members that depend on the precise value of N.
static ngram_type_t full_ngram_type (void)
static ngram_size_t n (void)
static size_t size_of (ngram_size_t degree)
static ngram_type_t types_count (void)

Private Types

typedef Overflow< const NGram *, frequency_t > _overflows_map_t
 Type definition of Overflow container used with NGram instances.

Private Member Functions

NGram & operator= (const NGram &source)
 Private (forbidden) operators.
bool _overflow (void) const
 Private methods.
void _type (ngram_type_t ngram_type)
 Sets given type as NGram type.

Private Attributes

union {
   freq_counter_t   _frequency
 NGram frequency.
   ngram_type_t   _frequency_bits [sizeof(freq_counter_t)/sizeof(ngram_type_t)]
 The last item in the array stores the NGram type; its MSB is used as the overflow flag bit.
}; 
 Private data.

Static Private Attributes

static const freq_counter_t _max_frequency = std::numeric_limits<freq_counter_t>::max() >> (sizeof(ngram_type_t)*constants::bits_per_char)
 Constant statics first.
static const ngram_type_t _overflow_flag_mask = static_cast<ngram_type_t>(1 << (sizeof(ngram_type_t)*constants::bits_per_char - 1))
 Mask of overflow flag bit.
static const size_t _last_bit_index = sizeof(freq_counter_t) / sizeof(ngram_type_t) - 1
 Index of the last bit(s) within the frequency-type union (it is the index of the last item in the array).
static const ngram_type_t _type_mask = ff<ngram_type_t>() >> 1
 Mask of type bit(s).
static bool _context_tracing_on = false
 N-dependent and other non-const statics then.
static ngram_type_t _full_ngram_type = 0
 Full N-gram type value.
static std::vector< ngram_size_t > _members_count
 Conversion table: N-gram type -> members count.
static std::vector< size_t > _ngrams_memory_sizes
 Conversion table: N-gram degree -> required memory size.
static _overflows_map_t _overflows
 Mapping of overflows.
static ngram_type_t _types_count = 0
 Number of N-gram types (equal to 2 to the power of N).

Friends

class NGramToken
 NGramToken can access our private stuff.

Classes

class  Member
 Single N-gram member representation.


Detailed Description

Class designed to hold the data of a single N-gram.

Each NGram has a type and a frequency counter (stored in the same place by means of a union). It also has from 0 to N members, but they are not stored within the NGram object itself; more precisely, they are stored directly behind it in memory. Members are accessed by incrementing the `this` pointer and casting it to a pointer to NGram::Member. The exact number of members of a given NGram is determined by its type.
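
The trailing-member layout described above can be illustrated with a minimal sketch. It is not the PACE implementation: `DemoNGram`, `DemoMember`, `make_demo_ngram` and all field names are hypothetical, and the degree is stored explicitly here, whereas the real class derives it from the N-gram type.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>
#include <vector>

// Hypothetical stand-in for NGram::Member; the real member layout is not shown here.
struct DemoMember {
    std::uint32_t lemma;
    std::uint32_t tag;
};

// Hypothetical stand-in for NGram: a fixed header followed in memory by its members.
struct DemoNGram {
    std::uint32_t header;  // stands for the packed type/frequency word
    std::uint32_t degree;  // assumption: stored explicitly to keep the sketch short

    // Members start immediately behind the header object.
    DemoMember* begin() { return reinterpret_cast<DemoMember*>(this + 1); }
    DemoMember* end() { return begin() + degree; }
    DemoMember* get(std::uint32_t index) { return begin() + index; }

    // Bytes needed for the header plus `degree` trailing members.
    static std::size_t size_of(std::uint32_t degree) {
        return sizeof(DemoNGram) + degree * sizeof(DemoMember);
    }
};

// Builds an N-gram of the given degree inside `buffer` and fills its members.
inline DemoNGram* make_demo_ngram(std::vector<unsigned char>& buffer,
                                  std::uint32_t degree) {
    buffer.assign(DemoNGram::size_of(degree), 0);
    DemoNGram* ngram = new (buffer.data()) DemoNGram{0, degree};
    for (std::uint32_t i = 0; i < degree; ++i) {
        new (ngram->get(i)) DemoMember{100 + i, 200 + i};
    }
    return ngram;
}
```

Because the members live outside the object, such an N-gram has to be created in a buffer of `size_of(degree)` bytes; a plain stack or array instance would have no room behind it.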


Member Typedef Documentation

typedef Overflow<const NGram*, frequency_t> ace::NGram::_overflows_map_t [private]

Type definition of Overflow container used with NGram instances.


Constructor & Destructor Documentation

ace::NGram::NGram ( void   )  [inline]

Default constructor. Constructs a zero-typed N-gram with the frequency counter set to 0 and no context store.

Because of its simplicity it is declared public; there is no difference between context-free and context-aware zero-typed N-grams.

ace::NGram::NGram ( const raw_ngram_t & raw_ngram,
ngram_type_t  ngram_type 
)

Raw N-gram constructor.

Constructs an N-gram of the given type. Sets the frequency counter to 1.

Parameters:
raw_ngram Source raw N-gram.
ngram_type Type of N-gram to be created (determines which members of `raw_ngram` are copied).

ace::NGram::NGram ( const NGram & source  ) 

Copy construction of NGrams is allowed.

Parameters:
source NGram to be copied.

ace::NGram::NGram ( const NGram & source,
ngram_type_t  ngram_type 
)

"Down-cast" copy-constructor.

Frequency is copied from source.

Parameters:
source Fully typed N-gram to be down-cast.
ngram_type Type of N-gram to be created (determines which members of `source` are copied).


Member Function Documentation

NGram& ace::NGram::operator= ( const NGram & source  )  [private]

Private (forbidden) operators.

bool ace::NGram::_overflow ( void   )  const [inline, private]

Private methods.

Returns:
True if the NGram frequency counter has overflowed.

void ace::NGram::_type ( ngram_type_t  ngram_type  )  [private]

Sets given type as NGram type.

Note: the type bits are expected to be all zero beforehand, so a bitwise OR suffices to assign the new type.

Parameters:
ngram_type New NGram type.

void ace::NGram::init ( void   )  [static]

Initializes all static members that depend on the precise value of N.

ngram_size_t ace::NGram::degree (  )  const [inline]

Returns:
Number of members (degree) of N-gram.

static ngram_type_t ace::NGram::full_ngram_type ( void   )  [inline, static]

Returns:
Full N-gram type.

bool ace::NGram::is_zero ( void   )  const [inline]

Returns:
True, if N-gram is zero-typed.

static ngram_size_t ace::NGram::n ( void   )  [inline, static]

Returns:
The N number.

static size_t ace::NGram::size_of ( ngram_size_t  degree  )  [inline, static]

Returns:
Size of the memory occupied by an N-gram of the given degree.

size_t ace::NGram::size_of ( void   )  const [inline]

Returns:
Size of the memory occupied by this N-gram.

ngram_type_t ace::NGram::type ( void   )  const [inline]

Returns:
N-gram type.

static ngram_type_t ace::NGram::types_count ( void   )  [inline, static]

Returns:
Number of all N-gram types (which is 2 to the power of N).

Member* ace::NGram::begin ( void   )  [inline]

Returns:
Pointer to the instance of first N-gram Member.

const Member* ace::NGram::begin ( void   )  const [inline]

Returns:
Pointer to the const instance of first N-gram Member.

Member* ace::NGram::end ( void   )  [inline]

Returns:
Pointer to the memory behind the last N-gram Member.

const Member* ace::NGram::end ( void   )  const [inline]

Returns:
Pointer to the const memory behind the last N-gram Member.

Member* ace::NGram::get ( ngram_size_t  index  )  [inline]

Returns:
Pointer to the index-th instance of N-gram Member(s).

const Member* ace::NGram::get ( ngram_size_t  index  )  const [inline]

Returns:
Pointer to the index-th const instance of N-gram Member(s).

frequency_t ace::NGram::frequency ( void   )  const [inline]

Returns:
N-gram frequency.

void ace::NGram::inc ( void   ) 

Increments frequency counter by 1.
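
This reference documents a maximum in-place frequency, an overflow flag and an Overflow map keyed by `const NGram *`, but not how `inc` combines them. One plausible scheme, shown purely as an assumption, keeps the in-place counter saturated at its maximum and counts the excess in a side table. `DemoCounter`, `demo_overflows` and `max_inline` are hypothetical names, and `std::map` merely stands in for the Overflow container.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

struct DemoCounter;

// Hypothetical side table standing in for the Overflow container.
inline std::map<const DemoCounter*, std::uint64_t>& demo_overflows() {
    static std::map<const DemoCounter*, std::uint64_t> table;
    return table;
}

struct DemoCounter {
    static const std::uint32_t max_inline = 3;  // tiny on purpose, for demonstration
    std::uint32_t inline_count;                 // in-place part of the frequency
    bool overflowed;                            // stands for the overflow flag bit

    DemoCounter() : inline_count(0), overflowed(false) {}

    // Counts in place until the maximum is reached, then in the side table.
    void inc() {
        if (inline_count < max_inline) {
            ++inline_count;
            return;
        }
        overflowed = true;
        ++demo_overflows()[this];
    }

    // Total frequency: in-place part plus any excess from the side table.
    std::uint64_t frequency() const {
        if (!overflowed) return inline_count;
        return inline_count + demo_overflows().find(this)->second;
    }
};
```

Since the side table is keyed by object address, a scheme like this has to update the table whenever an overflowed object is copied or destroyed; the sketch leaves that out.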

string_index_t ace::NGram::lemma ( ngram_size_t  i  )  const [inline]

Parameters:
i Member index (which goes from 0 to N-1).
Returns:
Lemma (index) of i-th N-gram Member.

tag_index_t ace::NGram::tag ( ngram_size_t  i  )  const [inline]

Parameters:
i Member index (which goes from 0 to N-1).
Returns:
Morphologic tag (index) of i-th N-gram Member.

ngram_size_t ace::NGram::parent ( ngram_size_t  i  )  const [inline]

Parameters:
i Member index (which goes from 0 to N-1).
Returns:
Parent member index of i-th N-gram Member.

dependency_index_t ace::NGram::dependency ( ngram_size_t  i  )  const [inline]

Parameters:
i Member index (which goes from 0 to N-1).
Returns:
Dependency (index) of i-th N-gram Member.

void ace::NGram::add_to_context ( const PartOfContext & part_of_context  ) 

Adds given part_of_context to N-gram context.

Parameters:
part_of_context To be added to the N-gram context.

context_t* ace::NGram::context ( void   )  [inline]

Returns:
Pointer to the context container.

const context_t* ace::NGram::context ( void   )  const [inline]

Returns:
Pointer to the constant context container.


Friends And Related Function Documentation

friend class NGramToken [friend]

NGramToken can access our private stuff.


Member Data Documentation

const freq_counter_t ace::NGram::_max_frequency = std::numeric_limits<freq_counter_t>::max() >> (sizeof(ngram_type_t)*constants::bits_per_char) [static, private]

Constant statics first.

Max allowed frequency. Acts also as a mask for frequency retrieving.

const ngram_type_t ace::NGram::_overflow_flag_mask = static_cast<ngram_type_t>(1 << (sizeof(ngram_type_t)*constants::bits_per_char - 1)) [static, private]

Mask of overflow flag bit.

const size_t ace::NGram::_last_bit_index = sizeof(freq_counter_t) / sizeof(ngram_type_t) - 1 [static, private]

Index of the last bit(s) within the frequency-type union (it is the index of the last item in the array).

const ngram_type_t ace::NGram::_type_mask = ff<ngram_type_t>() >> 1 [static, private]

Mask of type bit(s).

bool ace::NGram::_context_tracing_on = false [static, private]

N-dependent and other non-const statics then.

Indicates whether we run in context-tracing mode (true) or context-free mode (false).

ngram_type_t ace::NGram::_full_ngram_type = 0 [static, private]

Full N-gram type value.

std::vector< ngram_size_t > ace::NGram::_members_count [static, private]

Conversion table: N-gram type -> members count.
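
A table of this kind could be filled once during initialization. The sketch below assumes that an N-gram type is an N-bit mask selecting which members are present, which would explain why there are 2 to the power of N types; the reference does not confirm this encoding, and `build_members_count` is a hypothetical helper.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Builds a table indexed by type, assuming each type is an N-bit selection mask
// (an assumption, not confirmed by this reference).
inline std::vector<unsigned> build_members_count(unsigned n) {
    const std::size_t types_count = static_cast<std::size_t>(1) << n;  // 2^N types
    std::vector<unsigned> table(types_count, 0);
    for (std::size_t type = 1; type < types_count; ++type) {
        // Bits set in `type` = bits set in `type >> 1` plus the lowest bit of `type`.
        table[type] = table[type >> 1] + static_cast<unsigned>(type & 1);
    }
    return table;
}
```

Under this assumption the zero type has no members and the full type, with all N bits set, has N members.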

std::vector< size_t > ace::NGram::_ngrams_memory_sizes [static, private]

Conversion table: N-gram degree -> required memory size.

_overflows_map_t ace::NGram::_overflows [static, private]

Mapping of overflows.

ngram_type_t ace::NGram::_types_count = 0 [static, private]

Number of N-gram types (equal to 2 to the power of N).

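freq_counter_t ace::NGram::_frequency [private]

NGram frequency.

ngram_type_t ace::NGram::_frequency_bits[sizeof(freq_counter_t)/sizeof(ngram_type_t)] [private]

The last item in the array stores the NGram type; its MSB is used as the overflow flag bit.

Note: We expect little-endian ordering!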

union { ... } [private]

Private data.

This item holds the frequency, the type and the overflow flag together. The bits are laid out as follows:

(MSB) |O|TTTTTTT|F..................F| (LSB)

O - overflow flag bit (MSB).
TTTTTTT - type bits (7).
F...F - frequency bits (all remaining bits).

Bitmasks are used to retrieve each part.
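
The packing can be tried out with the following sketch. It assumes a 32-bit `freq_counter_t` and an 8-bit `ngram_type_t` (the reference does not state the real widths), and it uses shifts on the whole word instead of the union, which avoids the dependence on byte order. All `demo_` names and the helper functions are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Assumed widths; the real freq_counter_t and ngram_type_t are not shown here.
typedef std::uint32_t demo_counter_t;
typedef std::uint8_t demo_type_t;

const unsigned demo_type_bits = sizeof(demo_type_t) * 8;                  // 8
const unsigned demo_shift = sizeof(demo_counter_t) * 8 - demo_type_bits;  // 24

// These mirror the documented initializers of _max_frequency,
// _overflow_flag_mask and _type_mask.
const demo_counter_t demo_max_frequency =
    std::numeric_limits<demo_counter_t>::max() >> demo_type_bits;         // 0x00FFFFFF
const demo_type_t demo_overflow_flag_mask =
    static_cast<demo_type_t>(1u << (demo_type_bits - 1));                 // 0x80
const demo_type_t demo_type_mask = static_cast<demo_type_t>(0xFFu >> 1);  // 0x7F

// Packs overflow flag, type and frequency into one word: |O|TTTTTTT|F...F|.
inline demo_counter_t pack(bool overflow, demo_type_t type, demo_counter_t frequency) {
    demo_counter_t top = static_cast<demo_counter_t>(type & demo_type_mask);
    if (overflow) top |= demo_overflow_flag_mask;
    return (top << demo_shift) | (frequency & demo_max_frequency);
}

inline demo_counter_t frequency_of(demo_counter_t word) {
    return word & demo_max_frequency;
}

inline demo_type_t type_of(demo_counter_t word) {
    return static_cast<demo_type_t>((word >> demo_shift) & demo_type_mask);
}

inline bool overflow_of(demo_counter_t word) {
    return ((word >> demo_shift) & demo_overflow_flag_mask) != 0;
}
```

With these widths, `pack(true, 5, 1234)` yields `0x850004D2`: the top byte `0x85` carries the overflow flag and type 5, and the low 24 bits carry the frequency.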


Generated on Wed Aug 6 23:25:50 2008 for PACE by  doxygen 1.5.6