barry: Your go-to motif accountant  0.0-1
Full enumeration of sample space and fast count of sufficient statistics for binary arrays

Barry: your go-to motif accountant

This repository contains a C++ template library that counts sufficient statistics on binary arrays. Its primary goal is to provide a general framework for building discrete exponential-family models. Exponential Random Graph Models (ERGMs) are one example, but barry also handles non-square arrays.
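
For orientation, the standard form of a discrete exponential-family model is sketched below. The notation is chosen here for illustration and is not taken from the barry sources: y is an observed binary array, s(y) its vector of sufficient statistics, θ the parameter vector, and 𝒴 the support set.

```latex
\[
  \Pr(Y = y \mid \theta)
    = \frac{\exp\{\theta^{\top} s(y)\}}
           {\sum_{y' \in \mathcal{Y}} \exp\{\theta^{\top} s(y')\}}
    = \frac{\exp\{\theta^{\top} s(y)\}}
           {\sum_{s'} w(s') \exp\{\theta^{\top} s'\}},
  \qquad
  w(s') = \bigl|\{\, y' \in \mathcal{Y} : s(y') = s' \,\}\bigr| .
\]
```

The second form shows why enumerating the sample space and tabulating how often each value of the sufficient statistics occurs is enough to evaluate the normalizing constant for any θ.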

Key features of barry include:

  • Sparse arrays.
  • User-defined count statistics.
  • User-defined constraints on the support set.
  • Powerset generation of binary arrays.
  • Discrete Exponential Family Models module (DEFMs).
  • Pooled DEFMs.

To use barry, you can either download the entire repository or, since it is header-only, the single header version barry.hpp.

This library was created and is maintained by Dr. George G. Vega Yon as part of his doctoral dissertation "Essays on Bioinformatics and Social Network Analysis: Statistical and Computational Methods for Complex Systems."

Examples

Counting statistics in a graph

In the following code we create an array of size 6x6 of class Network (available in the namespace netcounters), add and remove ties, print the graph, and count common statistics used in ERGMs:

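#include <iostream>
#include <ostream>
#include <vector>
#include "../include/barry.hpp"
typedef std::vector< unsigned int > vuint;
int main() {
  // Creating a network of size six with seven ties
  // (first vector: row indices, second vector: column indices)
  netcounters::Network net(
    6, 6,
    {0, 0, 4, 4, 2, 0, 1},
    {1, 2, 0, 2, 4, 0, 1}
  );
  // What does this look like?
  net.print("Current view");
  // Adding extra ties
  net += {1, 0};
  net(2, 0) = true;
  // And removing a couple
  net(0, 0) = false;
  net -= {1, 1};
  net.print("New view");
  // Initializing the data. The program deals with freeing the memory
  net.set_data(new netcounters::NetworkData, true);
  // Creating counter object for the network and adding stats to count,
  // in the same order in which the results are printed below.
  // NOTE: the constructor argument (&net) and the `counters` member are
  // assumed here; see network.hpp for the exact interface.
  netcounters::NetStatsCounter counter(&net);
  netcounters::counter_edges(counter.counters);
  netcounters::counter_ttriads(counter.counters);
  netcounters::counter_isolates(counter.counters);
  netcounters::counter_ctriads(counter.counters);
  netcounters::counter_mutual(counter.counters);
  // Counting and printing the results
  std::vector< double > counts = counter.count_all();
  std::cout <<
    "Edges : " << counts[0] << std::endl <<
    "Transitive triads : " << counts[1] << std::endl <<
    "Isolates : " << counts[2] << std::endl <<
    "C triads : " << counts[3] << std::endl <<
    "Mutuals : " << counts[4] << std::endl;
  return 0;
}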

Compiling this program using g++

g++ -std=c++11 -Wall -pedantic 08-counts.cpp -o counts && ./counts

Yields the following output:

Current view
[ 0,] 1 1 1 . . .
[ 1,] . 1 . . . .
[ 2,] . . . . 1 .
[ 3,] . . . . . .
[ 4,] 1 . 1 . . .
[ 5,] . . . . . .
New view
[ 0,] . 1 1 . . .
[ 1,] 1 . . . . .
[ 2,] 1 . . . 1 .
[ 3,] . . . . . .
[ 4,] 1 . 1 . . .
[ 5,] . . . . . .
Edges : 7
Transitive triads : 3
Isolates : 2
C triads : 1
Mutuals : 3

Features

Efficient memory usage

One of the key features of barry is that it handles memory efficiently. In pooled-data models, the statistical-models module avoids recomputing the support when possible by keeping track of which datasets (networks, for instance) share the same support.

Documentation

More information can be found on the Doxygen website and in the PDF version of the documentation.

Code of Conduct

Please note that the barry project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.