Colibri Core

Count and extract n-grams and patterns from large corpus data efficiently

Text Processing, Linux, macOS, C++, GPL-3.0

Description

Colibri Core quickly and efficiently counts and extracts patterns (n-grams, skipgrams, and flexgrams) from large corpus data, computes statistics on those patterns, and computes relations between them.

AI Summary

Fast n-gram and pattern extraction from large corpora with statistical analysis

Capabilities

  • Extract n-grams from corpora
  • Count pattern frequencies
  • Compute pattern statistics
  • Find pattern relations
  • Handle large corpus data
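
As a rough illustration of what n-gram counting involves, the sketch below counts contiguous token windows in plain Python. It is not Colibri Core's API or implementation: the function name count_ngrams is hypothetical, and an in-memory counter like this is only practical for small inputs, which is the gap a dedicated tool is meant to fill.

```python
from collections import Counter

def count_ngrams(tokens, n):
    """Count every contiguous window of n tokens (hypothetical helper)."""
    # Slide a window of size n over the token list; each window is one n-gram.
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

tokens = "to be or not to be".split()
bigrams = count_ngrams(tokens, 2)
print(bigrams[("to", "be")])  # prints 2
```

Holding every n-gram as a tuple of strings uses far more memory than a compact encoding would, so this approach does not scale to large corpora.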

Use When

  • When you need n-gram extraction from large corpora
  • When computing pattern statistics on text

Avoid When

  • When working with small text files
