#include <config.h>
Public Member Functions | |
| TaggedLemma (string_index_t lemma, tag_index_t tag) | |
| string_index_t | lemma (void) const |
| Lemma getter. | |
| tag_index_t | tag (void) const |
| Tag getter. | |
Private Attributes | |
| union { | |
| string_index_t _lemma | |
| tag_index_t _bits [sizeof(string_index_t)/sizeof(tag_index_t)] | |
| }; | |
| Lemma and tag are kept at the same address. The lemma resides in the 24 least significant bits. | |
Static Private Attributes | |
| static const size_t | _last_bit_index = sizeof(string_index_t) / sizeof(tag_index_t) - 1 |
| static const string_index_t | _lemma_bits_mask = std::numeric_limits<string_index_t>::max() >> (sizeof(tag_index_t) * constants::bits_per_char) |
When processing large text corpora, the default schema (24 bits for the lemma, 8 bits for the tag) will quite likely need to change. The bit split is derived entirely from the sizes of string_index_t and tag_index_t, so this is the only place that needs to be altered.
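The packing scheme can be illustrated with a minimal sketch. It is not the library's implementation: it assumes `string_index_t` is a 32-bit unsigned integer and `tag_index_t` an 8-bit unsigned integer (the real typedefs are defined elsewhere), uses `CHAR_BIT` in place of `constants::bits_per_char`, and replaces the union with explicit shifts and masks so the result does not depend on byte order. The class name `TaggedLemmaSketch` and the member `_tag_shift` are hypothetical.

```cpp
#include <cassert>
#include <climits>
#include <cstdint>
#include <limits>

// Assumed typedefs; the real definitions live elsewhere in the library.
typedef std::uint32_t string_index_t;
typedef std::uint8_t tag_index_t;

// Hypothetical stand-in for ace::TaggedLemma: both values share one word.
class TaggedLemmaSketch {
public:
    TaggedLemmaSketch(string_index_t lemma, tag_index_t tag)
        : _packed((lemma & _lemma_bits_mask) |
                  (static_cast<string_index_t>(tag) << _tag_shift)) {
        // A lemma index that needs more than 24 bits would be truncated.
        assert(lemma <= _lemma_bits_mask);
    }

    // Lemma getter: keep only the low (least significant) bits.
    string_index_t lemma() const { return _packed & _lemma_bits_mask; }

    // Tag getter: shift the high (most significant) bits down.
    tag_index_t tag() const {
        return static_cast<tag_index_t>(_packed >> _tag_shift);
    }

private:
    // Number of bits left for the lemma (24 under the assumed typedefs).
    static constexpr unsigned _tag_shift =
        (sizeof(string_index_t) - sizeof(tag_index_t)) * CHAR_BIT;

    // Same expression as _lemma_bits_mask in the class documented here.
    static constexpr string_index_t _lemma_bits_mask =
        std::numeric_limits<string_index_t>::max() >>
        (sizeof(tag_index_t) * CHAR_BIT);

    string_index_t _packed;
};
```

Under these assumptions, widening `tag_index_t` to 16 bits would shrink the lemma range to 16 bits without touching any other code, which is the kind of schema change the paragraph above describes.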
| ace::TaggedLemma::TaggedLemma | ( | string_index_t | lemma, | |
| tag_index_t | tag | |||
| ) | [inline] |
| lemma | Lemma index. | |
| tag | Tag index. |
| string_index_t ace::TaggedLemma::lemma | ( | void | ) | const [inline] |
Lemma getter.
| tag_index_t ace::TaggedLemma::tag | ( | void | ) | const [inline] |
Tag getter.
| tag_index_t ace::TaggedLemma::_bits[sizeof(string_index_t)/sizeof(tag_index_t)] |
union { ... } [private] |
Lemma and tag are kept at the same address. The lemma resides in the 24 least significant bits.
The tag resides in the 8 most significant bits.
const size_t ace::TaggedLemma::_last_bit_index = sizeof(string_index_t) / sizeof(tag_index_t) - 1 [static, private] |
const string_index_t ace::TaggedLemma::_lemma_bits_mask = std::numeric_limits<string_index_t>::max() >> (sizeof(tag_index_t) * constants::bits_per_char) [static, private] |