proto/Source/extractor.cpp File Reference

Definition (implementation) of N-grams extraction module interface.

#include <algorithm>
#include <utility>
#include <vector>
#include "buffer.h"
#include "config.h"
#include "extractor.h"
#include "ngram.h"
#include "notifier.h"
#include "ntree.h"
#include "parser.h"
#include "persistent.h"
#include "utils.h"
#include "word.h"

Namespaces

namespace  ace

Functions

void ace::_build_tree (const words_range_t &sentence, ntree_t &nodes)
 Builds an ntree based on the grammar dependency tree of the given sentence.
words_store_t::const_iterator ace::_extract_ngram (words_store_t::const_iterator sentence_start, const subtree_t &subtree, raw_ngram_t &raw_ngram)
 Extracts the raw N-gram related to the passed subtree.
NGram * ace::_store_ngram (const raw_ngram_t &raw_ngram, ngram_type_t type)
 Stores an instance of the given N-gram.
void ace::_store_wide_context (NGram *ngram, const raw_ngram_t &raw_ngram, words_range_t context_range)
 Stores the wide context for the given N-gram.
void ace::_store_raw_ngram (const raw_ngram_t &raw_ngram, const Buffer &buffer, NamedDataFileStats &stats, words_store_t::const_iterator head)
 Stores all *-types for the given raw N-gram and updates the related file stats counter.
size_t ace::extract (std::ifstream &input_file, NamedDataFileStats &stats)
 Extracts N-grams from the given input data file and counts file stats.


Detailed Description

Definition (implementation) of N-grams extraction module interface.

See extractor.h for more info about the interface.
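
The following is a minimal sketch of the general idea behind the tree-building and subtree-extraction helpers listed above, not the actual implementation. The real ntree_t, subtree_t, raw_ngram_t and words_store_t types are not shown on this page, so every name in the sketch namespace (Word, Tree, build_tree, collect, extract_ngram) is a hypothetical stand-in. It assumes each word stores the index of its head in the sentence, and that the N-gram for a subtree is simply the words it covers, in sentence order; the real module may define both differently.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

namespace sketch {

// Hypothetical stand-in for a parsed word: surface form plus the index of
// its head in the sentence (-1 marks the root of the dependency tree).
struct Word {
    std::string form;
    int head;
};

// Hypothetical stand-in for ntree_t: children[i] lists the dependents of word i.
using Tree = std::vector<std::vector<std::size_t>>;

// Builds child lists from the head indices of a sentence.
// Assumes the head indices describe a valid tree (no cycles).
Tree build_tree(const std::vector<Word>& sentence) {
    Tree children(sentence.size());
    for (std::size_t i = 0; i < sentence.size(); ++i) {
        const int head = sentence[i].head;
        if (head >= 0) {
            assert(static_cast<std::size_t>(head) < sentence.size());
            children[static_cast<std::size_t>(head)].push_back(i);
        }
    }
    return children;
}

// Appends the indices of all words in the subtree rooted at `root` to `out`.
void collect(const Tree& tree, std::size_t root, std::vector<std::size_t>& out) {
    out.push_back(root);
    for (std::size_t child : tree[root]) {
        collect(tree, child, out);
    }
}

// Returns the words covered by the subtree rooted at `root`, in sentence order.
std::vector<std::string> extract_ngram(const std::vector<Word>& sentence,
                                       const Tree& tree,
                                       std::size_t root) {
    std::vector<std::size_t> indices;
    collect(tree, root, indices);
    std::sort(indices.begin(), indices.end());

    std::vector<std::string> ngram;
    for (std::size_t i : indices) {
        ngram.push_back(sentence[i].form);
    }
    return ngram;
}

}  // namespace sketch
```

For a sentence such as "the dog barks loudly" with "barks" as the root, calling sketch::extract_ngram on the node for "dog" would collect that word together with its dependent "the".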

(C) Ceslav Przywara 2008, MFF UK Prague


Generated on Wed Aug 6 23:25:49 2008 for PACE by doxygen 1.5.6