#include <ngram.h>
Public Member Functions | |
| NGram (void) | |
| Default constructor: constructs a zero-typed N-gram with the frequency counter set to 0 and no context store. | |
| NGram (const raw_ngram_t &raw_ngram, ngram_type_t ngram_type) | |
| Raw N-gram constructor. | |
| NGram (const NGram &source) | |
| Copy construction of NGrams is allowed. | |
| NGram (const NGram &source, ngram_type_t ngram_type) | |
| "Down-cast" copy-constructor. | |
| ngram_size_t | degree () const |
| bool | is_zero (void) const |
| size_t | size_of (void) const |
| ngram_type_t | type (void) const |
| Member * | begin (void) |
| const Member * | begin (void) const |
| Member * | end (void) |
| const Member * | end (void) const |
| Member * | get (ngram_size_t index) |
| const Member * | get (ngram_size_t index) const |
| frequency_t | frequency (void) const |
| void | inc (void) |
| Increments frequency counter by 1. | |
| string_index_t | lemma (ngram_size_t i) const |
| tag_index_t | tag (ngram_size_t i) const |
| ngram_size_t | parent (ngram_size_t i) const |
| dependency_index_t | dependency (ngram_size_t i) const |
| void | add_to_context (const PartOfContext &part_of_context) |
| Adds given part_of_context to N-gram context. | |
| context_t * | context (void) |
| const context_t * | context (void) const |
Static Public Member Functions | |
| static void | init (void) |
| Initializes all static members that depend on the precise value of N. | |
| static ngram_type_t | full_ngram_type (void) |
| static ngram_size_t | n (void) |
| static size_t | size_of (ngram_size_t degree) |
| static ngram_type_t | types_count (void) |
Private Types | |
| typedef Overflow< const NGram *, frequency_t > | _overflows_map_t |
| Type definition of Overflow container used with NGram instances. | |
Private Member Functions | |
| NGram & | operator= (const NGram &source) |
| Private (forbidden) operators. | |
| bool | _overflow (void) const |
| Private methods. | |
| void | _type (ngram_type_t ngram_type) |
| Sets given type as NGram type. | |
Private Attributes | |
| union { | |
| freq_counter_t _frequency | |
| NGram frequency. | |
| ngram_type_t _frequency_bits [sizeof(freq_counter_t)/sizeof(ngram_type_t)] | |
| The last item in the array stores the NGram type, and its MSB is used as the overflow flag bit. | |
| }; | |
| Private data. | |
Static Private Attributes | |
| static const freq_counter_t | _max_frequency = std::numeric_limits<freq_counter_t>::max() >> (sizeof(ngram_type_t)*constants::bits_per_char) |
| Constant statics first. | |
| static const ngram_type_t | _overflow_flag_mask = static_cast<ngram_type_t>(1 << (sizeof(ngram_type_t)*constants::bits_per_char - 1)) |
| Mask of overflow flag bit. | |
| static const size_t | _last_bit_index = sizeof(freq_counter_t) / sizeof(ngram_type_t) - 1 |
| Index of last bit(s) within the frequency-type union (it's the index of the last item in the array). | |
| static const ngram_type_t | _type_mask = ff<ngram_type_t>() >> 1 |
| Mask of type bit(s). | |
| static bool | _context_tracing_on = false |
| N-dependent and other non-const statics then. | |
| static ngram_type_t | _full_ngram_type = 0 |
| Full N-gram type value. | |
| static std::vector< ngram_size_t > | _members_count |
| Conversion table: N-gram type -> members count. | |
| static std::vector< size_t > | _ngrams_memory_sizes |
| Conversion table: N-gram degree -> required memory size. | |
| static _overflows_map_t | _overflows |
| Mapping of overflows. | |
| static ngram_type_t | _types_count = 0 |
| Number of N-gram types (equal to 2 to the power of N). | |
Friends | |
| class | NGramToken |
| NGramToken can access our private stuff. | |
Classes | |
| class | Member |
| Single N-gram member representation. More... | |
Each NGram has a type and a frequency counter, which share the same storage through a union. It also has from 0 to N members, but they are not stored within the NGram object itself; more precisely, they are stored in memory immediately after it. NGram members are accessed by incrementing the `this` pointer and casting it to a pointer to NGram::Member. The exact number of members of a given NGram is determined by its type.
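The trailing-member layout described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the library's code: `SketchNGram`, `SketchMember`, `make_sketch`, `destroy_sketch` and all field types are hypothetical, and the sketch keeps an explicit member count where the real class derives it from the N-gram type.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <new>

// Hypothetical stand-in for NGram::Member; the fields are assumptions.
struct SketchMember {
    std::uint32_t lemma;
    std::uint32_t tag;
};

// Hypothetical stand-in for NGram: a fixed header followed in memory by
// `count` members that are not part of the struct itself.
struct SketchNGram {
    std::uint32_t frequency;  // the real class packs type bits in here too
    std::uint32_t count;      // assumption: explicit count instead of a type

    // Members start at the first byte past the header.
    SketchMember* begin() { return reinterpret_cast<SketchMember*>(this + 1); }
    SketchMember* end() { return begin() + count; }
    SketchMember* get(std::uint32_t index) {
        assert(index < count);
        return begin() + index;
    }

    // Bytes needed for a header followed by `count` members.
    static std::size_t size_of(std::uint32_t count) {
        return sizeof(SketchNGram) + count * sizeof(SketchMember);
    }
};

// The cast in begin() is only safe if members are suitably aligned there.
static_assert(sizeof(SketchNGram) % alignof(SketchMember) == 0,
              "members must be aligned right after the header");

// Allocates one contiguous block and constructs header plus zeroed members.
inline SketchNGram* make_sketch(std::uint32_t count) {
    void* raw = std::malloc(SketchNGram::size_of(count));
    if (raw == nullptr) throw std::bad_alloc();
    SketchNGram* ngram = new (raw) SketchNGram{1u, count};
    for (SketchMember* m = ngram->begin(); m != ngram->end(); ++m)
        new (m) SketchMember{0u, 0u};
    return ngram;
}

// Both types are trivially destructible, so freeing the block is enough.
inline void destroy_sketch(SketchNGram* ngram) { std::free(ngram); }
```

Because header and members live in one allocation, such an object cannot be created on the stack or copied by plain assignment, which may be one reason the assignment operator is private here.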
typedef Overflow<const NGram *, frequency_t> ace::NGram::_overflows_map_t [private] |
| ace::NGram::NGram | ( | void | ) | [inline] |
Default constructor: constructs a zero-typed N-gram with the frequency counter set to 0 and no context store.
Because it is so simple, it is declared public; there is no difference between context-free and contexted zero-typed N-grams.
| ace::NGram::NGram | ( | const raw_ngram_t & | raw_ngram, | |
| ngram_type_t | ngram_type | |||
| ) |
Raw N-gram constructor.
Constructs an N-gram of the given type and sets the frequency counter to 1.
| raw_ngram | Source raw N-gram. | |
| ngram_type | Type of N-gram to be created (determines which members of `raw_ngram` are copied). |
| ace::NGram::NGram | ( | const NGram & | source | ) |
| ace::NGram::NGram | ( | const NGram & | source, | |
| ngram_type_t | ngram_type | |||
| ) |
"Down-cast" copy-constructor.
Frequency is copied from source.
| source | Fully typed N-gram to be down-cast. | |
| ngram_type | Type of N-gram to be created (determines which members of `source` are copied). |
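One plausible reading of "type determines which members are copied", together with the statement that there are 2 to the power of N types, is that a type is a bitmask over the N positions of a full N-gram. The sketch below illustrates that reading only; it is an assumption, not confirmed by this page, and `sketch_type_t`, `sketch_types_count`, `sketch_full_type`, `build_members_count` and `sketch_down_cast` are hypothetical names. Members are modeled as plain `int` values.

```cpp
#include <cstddef>
#include <vector>

// ASSUMPTION: bit i of a type is set when member i of the full N-gram is kept.
typedef unsigned sketch_type_t;

// With N positions there are 2^N possible subsets, hence 2^N types.
inline sketch_type_t sketch_types_count(unsigned n) { return 1u << n; }

// The full type keeps every position: all N low bits set.
inline sketch_type_t sketch_full_type(unsigned n) {
    return sketch_types_count(n) - 1u;
}

// Builds a "type -> members count" table by counting the set bits of each type.
inline std::vector<unsigned> build_members_count(unsigned n) {
    std::vector<unsigned> table(sketch_types_count(n), 0u);
    for (sketch_type_t t = 1; t < sketch_types_count(n); ++t)
        table[t] = table[t >> 1] + (t & 1u);
    return table;
}

// "Down-casts" a full N-gram: copies only the members selected by `type`.
inline std::vector<int> sketch_down_cast(const std::vector<int>& full,
                                         sketch_type_t type) {
    std::vector<int> kept;
    for (std::size_t i = 0; i < full.size(); ++i)
        if (type & (1u << i)) kept.push_back(full[i]);
    return kept;
}
```

Under this reading, type 0 is the zero-typed N-gram with no members, and a lookup table like the one built above would play the role of the documented type-to-members-count conversion table.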
| bool ace::NGram::_overflow | ( | void | ) | const [inline, private] |
| void ace::NGram::_type | ( | ngram_type_t | ngram_type | ) | [private] |
| void ace::NGram::init | ( | void | ) | [static] |
Initializes all static members that depend on the precise value of N.
| ngram_size_t ace::NGram::degree | ( | ) | const [inline] |
| static ngram_type_t ace::NGram::full_ngram_type | ( | void | ) | [inline, static] |
| bool ace::NGram::is_zero | ( | void | ) | const [inline] |
| static ngram_size_t ace::NGram::n | ( | void | ) | [inline, static] |
| static size_t ace::NGram::size_of | ( | ngram_size_t | degree | ) | [inline, static] |
| size_t ace::NGram::size_of | ( | void | ) | const [inline] |
| ngram_type_t ace::NGram::type | ( | void | ) | const [inline] |
| static ngram_type_t ace::NGram::types_count | ( | void | ) | [inline, static] |
| const Member* ace::NGram::begin | ( | void | ) | const [inline] |
| Member* ace::NGram::end | ( | void | ) | [inline] |
| const Member* ace::NGram::end | ( | void | ) | const [inline] |
| Member* ace::NGram::get | ( | ngram_size_t | index | ) | [inline] |
| const Member* ace::NGram::get | ( | ngram_size_t | index | ) | const [inline] |
| frequency_t ace::NGram::frequency | ( | void | ) | const [inline] |
| void ace::NGram::inc | ( | void | ) |
Increments frequency counter by 1.
| string_index_t ace::NGram::lemma | ( | ngram_size_t | i | ) | const [inline] |
| tag_index_t ace::NGram::tag | ( | ngram_size_t | i | ) | const [inline] |
| ngram_size_t ace::NGram::parent | ( | ngram_size_t | i | ) | const [inline] |
| dependency_index_t ace::NGram::dependency | ( | ngram_size_t | i | ) | const [inline] |
| void ace::NGram::add_to_context | ( | const PartOfContext & | part_of_context | ) |
Adds given part_of_context to N-gram context.
| part_of_context | To be added to the N-gram context. |
| context_t* ace::NGram::context | ( | void | ) | [inline] |
| const context_t* ace::NGram::context | ( | void | ) | const [inline] |
friend class NGramToken [friend] |
NGramToken can access our private stuff.
const freq_counter_t ace::NGram::_max_frequency = std::numeric_limits<freq_counter_t>::max() >> (sizeof(ngram_type_t)*constants::bits_per_char) [static, private] |
Constant statics first.
Maximum allowed frequency. Also acts as a mask for retrieving the frequency.
const ngram_type_t ace::NGram::_overflow_flag_mask = static_cast<ngram_type_t>(1 << (sizeof(ngram_type_t)*constants::bits_per_char - 1)) [static, private] |
Mask of overflow flag bit.
const size_t ace::NGram::_last_bit_index = sizeof(freq_counter_t) / sizeof(ngram_type_t) - 1 [static, private] |
Index of last bit(s) within the frequency-type union (it's the index of the last item in the array).
const ngram_type_t ace::NGram::_type_mask = ff<ngram_type_t>() >> 1 [static, private] |
Mask of type bit(s).
bool ace::NGram::_context_tracing_on = false [static, private] |
N-dependent and other non-const statics then.
Indicates whether we run in contextless or contextful mode.
ngram_type_t ace::NGram::_full_ngram_type = 0 [static, private] |
Full N-gram type value.
std::vector< ngram_size_t > ace::NGram::_members_count [static, private] |
Conversion table: N-gram type -> members count.
std::vector< size_t > ace::NGram::_ngrams_memory_sizes [static, private] |
Conversion table: N-gram degree -> required memory size.
NGram::_overflows_map_t ace::NGram::_overflows [static, private] |
Mapping of overflows.
ngram_type_t ace::NGram::_types_count = 0 [static, private] |
Number of N-gram types (equal to 2 to the power of N).
NGram frequency.
| ngram_type_t ace::NGram::_frequency_bits[sizeof(freq_counter_t)/sizeof(ngram_type_t)] |
The last item in the array stores the NGram type, and its MSB is used as the overflow flag bit.
Note: We expect little-endian ordering!
union { ... } [private] |
Private data.
Item holds the data for frequency, type and overflow flag together. The data are distributed as follows:
(MSB) |O|TTTTTTT|F..................F| (LSB)
O - overflow flag bit (MSB).
TTTTTTT - type bits (7).
F...F - frequency bits (all remaining bits).
Bitmasks are used to retrieve the appropriate data.
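The packing described above can be illustrated with shifts and masks on a single integer. This is a hedged sketch, not the class's implementation: it assumes `freq_counter_t` is 32 bits wide and `ngram_type_t` is 8 bits wide (the page does not state either), uses shifts instead of the union so that it does not depend on byte order, substitutes `std::numeric_limits<ngram_type_t>::max()` for the undocumented `ff<ngram_type_t>()` helper, and introduces the hypothetical functions `pack`, `unpack_frequency`, `unpack_type` and `has_overflow`.

```cpp
#include <cstdint>
#include <limits>

// ASSUMED widths, chosen for illustration only.
typedef std::uint32_t freq_counter_t;
typedef std::uint8_t ngram_type_t;

const unsigned type_bits = sizeof(ngram_type_t) * 8;                 // 8
const unsigned type_shift = sizeof(freq_counter_t) * 8 - type_bits;  // 24

// Mirrors of the documented constants, with 8 bits per char assumed.
const freq_counter_t max_frequency =
    std::numeric_limits<freq_counter_t>::max() >> type_bits;
const ngram_type_t overflow_flag_mask =
    static_cast<ngram_type_t>(1u << (type_bits - 1));
const ngram_type_t type_mask =
    static_cast<ngram_type_t>(std::numeric_limits<ngram_type_t>::max() >> 1);

// The most significant byte carries the overflow flag and the 7 type bits.
inline ngram_type_t top_byte(freq_counter_t packed) {
    return static_cast<ngram_type_t>(packed >> type_shift);
}

// The frequency occupies all bits below the top byte.
inline freq_counter_t unpack_frequency(freq_counter_t packed) {
    return packed & max_frequency;
}

inline ngram_type_t unpack_type(freq_counter_t packed) {
    return static_cast<ngram_type_t>(top_byte(packed) & type_mask);
}

inline bool has_overflow(freq_counter_t packed) {
    return (top_byte(packed) & overflow_flag_mask) != 0;
}

// Combines the three fields; out-of-range bits of each input are discarded.
inline freq_counter_t pack(ngram_type_t type, freq_counter_t frequency,
                           bool overflow) {
    const ngram_type_t top = static_cast<ngram_type_t>(
        (type & type_mask) | (overflow ? overflow_flag_mask : 0));
    return (static_cast<freq_counter_t>(top) << type_shift) |
           (frequency & max_frequency);
}
```

With these assumed widths the frequency field holds 24 bits, so a counter that reaches `max_frequency` would need the overflow flag and some external store to keep counting, which is presumably the purpose of the documented overflow map.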