Extra - Guide You through a Clear and Practical SWIG Programming Practice #

Hello, I’m Lu Yusheng, a senior software engineer at Autodesk’s Data Platform and Computing Platform. I am also the author of “Hands-on Deep Neural Networks on Mobile Platforms” and “Distributed Real-Time Processing Systems: Principles, Architecture, and Implementation.” I mainly work on C/C++ and JavaScript development and platform architecture research, and I also have extensive experience with SWIG. I am honored to be invited by Geek Time to share my knowledge with you today. So, let’s talk about SWIG.

We all know that Python is a glue language that is easy to learn and well suited to experimentation. Many machine learning developers and researchers now choose Python as their main programming language, and popular machine learning frameworks provide Python interfaces and tools. Compared with C++, which has a steeper learning curve, Python is therefore the usual entry point into the world of machine learning.

However, is the core of machine learning frameworks like TensorFlow or PyTorch written in Python?

Obviously not. There are many reasons for this, but the most significant one is performance. A core written in C++ and built with an optimizing compiler runs at close to native machine speed, and this advantage has firmly established C++ at the core of machine learning. The cores of both TensorFlow and PyTorch are developed in C/C++. In particular, the TensorFlow kernels are composed of highly optimized C++ and CUDA code.

In other words, TensorFlow describes models in Python, but the actual computations are executed by high-performance C++ code. In most cases, the data passed between operations is not copied back into the Python execution space. This is how machine learning frameworks deliver computational performance while remaining easy to use.

So when Python and C++ are used together, Python’s own performance limits matter much less. Python only needs to handle the tasks we give it, and the computationally demanding work can be left to C++!

Today, we will discuss how to use SWIG to wrap C++ programs for Python. First, I will guide you in writing a Python script to perform a simple machine learning task. Then, we will attempt to rewrite the computationally intensive part as a C++ program and wrap it using SWIG. The end result will be Python delegating the computationally intensive task to C++ for execution.

We will do a simple performance comparison and explain the use of SWIG in the process. At the end of today’s lesson, I will provide you with a learning path as a reference for future improvement.
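As a rough idea of how such a comparison can be set up, here is a minimal timing sketch that uses only the Python standard library. The helper time_call and the two sum-of-squares functions are hypothetical stand-ins, not part of this lesson's PCA code; in practice you would pass in the pure Python routine and its C++-backed counterpart.

```python
import time


def time_call(func, *args, repeats=5):
    """Return the best wall-clock time in seconds over `repeats` calls."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        func(*args)
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)
    return best


def slow_sum_of_squares(n):
    """Stand-in for a slow implementation: an explicit Python loop."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def fast_sum_of_squares(n):
    """Stand-in for a fast implementation: closed form for 0^2 + ... + (n-1)^2."""
    return (n - 1) * n * (2 * n - 1) // 6


n = 100_000
print("loop   :", time_call(slow_sum_of_squares, n))
print("formula:", time_call(fast_sum_of_squares, n))
```

Taking the best of several runs reduces the influence of background noise on the machine, which matters when the two timings are close.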

Our learning objective for today is clear: use SWIG to enable Python to call C++ code. Today’s content can therefore be read as a practical guide to SWIG programming. Before diving into this guide, let’s take a brief look at SWIG.

What is SWIG? #

SWIG is a software development tool that connects C/C++ with various high-level programming languages (in this case, we specifically emphasize Python). SWIG supports multiple types of target languages. Commonly supported scripting languages include JavaScript, Perl, PHP, Tcl, Ruby, and Python. Supported high-level programming languages include C#, D, Go, Java (including support for Android), Lua, OCaml, Octave, Scilab, and R.

We typically use SWIG to create high-level interpreted or compiled programming environments and interfaces. It is also commonly used as a testing tool for prototyping programs written in C/C++. A typical use case is parsing and creating C/C++ interfaces to generate glue code for higher-level programming languages such as Python. The recently released version 4.0.0 brings significant improvements and support for C++, including (but not limited to) the following:

Improved STL wrappers for C#, Java, and Ruby.
Added support for STL containers under C++11 for Java, Python, and Ruby.
Improved support for C++11 and C++14 code.
Fixed a series of bugs related to smart pointer shared_ptr in C++.
A series of bug fixes for extreme cases in the C preprocessor.
A series of bug fixes for member function pointer issues.
The supported Python versions are 2.7 and 3.2 to 3.7.

Implementing PCA Algorithm with Python #

With the help of SWIG, we can easily call C/C++ libraries from Python and even inherit from and use C++ classes. Next, let’s take a look at a PCA (Principal Component Analysis) algorithm implemented in Python.

Because our goal today is not to explain the PCA algorithm, it doesn’t matter if you’re not very familiar with it. I will directly provide the specific code, and we will focus on how to use SWIG. Below, I will first provide Code Listing 1.

Code Listing 1, PCA algorithm implemented in Python, testPCAPurePython.py:

import numpy as np

def compute_pca(data):
    m = np.mean(data, axis=0)
    datac = np.array([obs - m for obs in data])
    T = np.dot(datac, datac.T)
    [u,s,v] = np.linalg.svd(T)

    pcs = [np.dot(datac.T, item) for item in u.T ]

    pcs = np.array([d / np.linalg.norm(d) for d in pcs])

    return pcs, m, s, T, u

def compute_projections(I,pcs,m):
    projections = []
    for i in I:
        w = []
        for p in pcs:
            w.append(np.dot(i - m, p))
        projections.append(w)
    return projections

def reconstruct(w, X, m,dim = 5):
    return np.dot(w[:dim],X[:dim,:]) + m

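def normalize(samples, maxs = None):
    # Compare against None explicitly: "not maxs" is ambiguous for arrays and wrongly triggers for 0
    if maxs is None:
        maxs = np.max(samples)
    return np.array([np.ravel(s) / maxs for s in samples])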

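Now, let’s save this piece of code as testPCAPurePython.py. The listing only defines functions, so add calls on your own data to see any output, and then execute the script with the following command: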

python3 testPCAPurePython.py

Preparing SWIG #

Now, we have made some progress - we have written a PCA algorithm using Python and obtained some results. Next, let’s take a look at how to start SWIG development work. I will start by compiling the relevant components and then introduce a simple example for preparation.

First, let’s download the source code package from the SWIG website (http://swig.org/download.html) and start building it:

$ wget https://newcontinuum.dl.sourceforge.net/project/swig/swig/swig-4.0.0/swig-4.0.0.tar.gz # The download path may vary
$ tar -xvf swig-4.0.0.tar.gz
$ cd swig-4.0.0
$ wget https://ftp.pcre.org/pub/pcre/pcre-8.43.tar.gz # SWIG requires pcre as a dependency
$ sh ./Tools/pcre-build.sh # This script automatically builds pcre into the static library used by SWIG
$ ./configure # Note that bison needs to be installed. If not installed, the reader needs to install it manually
$ make
$ sudo make install
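After the installation finishes, it is worth confirming that the swig executable can be found on your PATH. The following is a minimal sketch, assuming Python 3.7 or later; the helper name swig_version_text is only illustrative.

```python
import shutil
import subprocess


def swig_version_text():
    """Return the output of `swig -version`, or None if swig is not on PATH."""
    exe = shutil.which("swig")
    if exe is None:
        return None
    completed = subprocess.run(
        [exe, "-version"], capture_output=True, text=True, check=False
    )
    return completed.stdout.strip()


version = swig_version_text()
if version is None:
    print("swig was not found on PATH")
else:
    print(version)
```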

Once everything is ready, let’s write a simple example. This example is also from the SWIG website (http://swig.org/tutorial.html). Let’s first create a simple C file. You can use a text editor of your choice (such as vi) to create a file named example.c and write the code shown in Listing 2.

Listing 2, example.c:

#include <time.h>
double My_variable = 3.0;

int fact(int n) {
    if (n <= 1) return 1;
    else return n*fact(n-1);
}

int my_mod(int x, int y) {
    return (x%y);
}

char *get_time()
{
    time_t ltime;
    time(&ltime);
    return ctime(&ltime);
}

Next, let’s write an interface definition file named example.i and a Python script that will be used for testing later. The contents are shown in Listing 3 and Listing 4.

Listing 3, example.i:

%module example
%{
    /* Put header files here or function declarations like below */
    extern double My_variable;
    extern int fact(int n);
    extern int my_mod(int x, int y);
    extern char *get_time();
%}

extern double My_variable;
extern int fact(int n);
extern int my_mod(int x, int y);
extern char *get_time();

Let me explain the code in Listing 3. In line 1, we define the module name as ’example’. In lines 2-8, we directly write the declarations of the variable and functions defined in example.c. Alternatively, we can put these declarations in an example.h header file and include example.h within the %{ ... %} block to achieve the same result. Lines 10-13 declare the exported interface so that you can call these functions directly from Python.

Listing 4, testExample.py:

import example
print(example.fact(5))
print(example.my_mod(7,3))
print(example.get_time())

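Great, now we are ready. Let’s run the following commands to generate the wrapper code, compile the object files, and link the final shared library. Adjust the -I path to match the Python version installed on your system: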

swig -python example.i
gcc -c -fPIC example.c example_wrap.c -I/usr/include/python3.6
gcc -shared example.o example_wrap.o -o _example.so
python3 testExample.py # Test the calls
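
The include path /usr/include/python3.6 in the gcc command is specific to one machine. As a small sketch, you can ask the interpreter you intend to use where its headers live by using the standard sysconfig module; the helper name python_include_flag is only illustrative.

```python
import sysconfig


def python_include_flag():
    """Build the -I flag for the headers of the running Python interpreter."""
    include_dir = sysconfig.get_paths()["include"]
    return "-I" + include_dir


# Paste the printed flag into the gcc compile command
print(python_include_flag())
```

Run this with the same python3 that will later import the module, so that the headers match the interpreter.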

In fact, from Listing 4, you can see that by importing example, we can directly call the C function interfaces and get return values in Python scripts.
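
To check that the wrapped functions return what you expect, it can help to keep a pure Python reference next to them. The sketch below mirrors fact and my_mod from Listing 2; the name c_mod is hypothetical. Note two assumptions: the C remainder operator truncates toward zero while the Python one floors, and the C version of fact overflows a 32-bit int for n greater than 12, which Python integers do not.

```python
def fact(n):
    """Mirror of the C fact(): returns 1 for n <= 1."""
    result = 1
    for k in range(2, n + 1):
        result *= k
    return result


def c_mod(x, y):
    """Remainder with C semantics: the result takes the sign of x."""
    r = abs(x) % abs(y)
    return r if x >= 0 else -r


print(fact(5))       # 120, the value example.fact(5) should print
print(c_mod(7, 3))   # 1, the value example.my_mod(7, 3) should print
print(c_mod(-7, 3))  # -1 in C, whereas Python's -7 % 3 is 2
```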

Wrapping a Python module written in C++ using SWIG #

At this point, we have a PCA algorithm written in C++, and now we are going to wrap it in a simple way. Because the C++ standard library has no linear algebra facilities, I used a third-party library called Armadillo to simplify the linear algebra operations. On Ubuntu, it can be installed with apt-get install libarmadillo-dev.

Furthermore, I want to emphasize once again that the focus of today’s lesson is not on explaining the PCA algorithm itself. So I hope you don’t get stuck here and miss out on the actual usage. Of course, for the sake of completeness, I will still provide a basic explanation of the code.

Let’s start the wrapping process. First, we need to write a header file called pca.h, and I have included its content in Code Listing 5.

Code Listing 5, pca.h:

#pragma once

#include <vector>
#include <string>
#include <armadillo>

class pca {
public:
    pca();
    explicit pca(long num_vars);
    virtual ~pca();

    bool operator==(const pca& other);

    void set_num_variables(long num_vars);
    long get_num_variables() const;
    void add_record(const std::vector<double>& record);
    std::vector<double> get_record(long record_index) const;
    long get_num_records() const;
    void set_do_normalize(bool do_normalize);
    bool get_do_normalize() const;
    void set_solver(const std::string& solver);
    std::string get_solver() const;

    void solve();

    double check_eigenvectors_orthogonal() const;
    double check_projection_accurate() const;

    void save(const std::string& basename) const;
    void load(const std::string& basename);

    void set_num_retained(long num_retained);
    long get_num_retained() const;
    std::vector<double> to_principal_space(const std::vector<double>& record) const;
    std::vector<double> to_variable_space(const std::vector<double>& data) const;
    double get_energy() const;
    double get_eigenvalue(long eigen_index) const;
    std::vector<double> get_eigenvalues() const;
    std::vector<double> get_eigenvector(long eigen_index) const;
    std::vector<double> get_principal(long eigen_index) const;
    std::vector<double> get_mean_values() const;
    std::vector<double> get_sigma_values() const;

protected:
    long num_vars_;
    long num_records_;
    long record_buffer_;
    std::string solver_;
    bool do_normalize_;
    long num_retained_;
    arma::Mat<double> data_;
    arma::Col<double> energy_;
    arma::Col<double> eigval_;
    arma::Mat<double> eigvec_;
    arma::Mat<double> proj_eigvec_;
    arma::Mat<double> princomp_;
    arma::Col<double> mean_;
    arma::Col<double> sigma_;
    void initialize_();
    void assert_num_vars_();
    void resize_data_if_needed_();
};

Next, let’s write the implementation in pca.cpp, which is the content of Code Listing 6.

Code Listing 6, pca.cpp:

#include "pca.h"
#include "utils.h"
#include <stdexcept>
#include <random>

pca::pca()
    : num_vars_(0),
      num_records_(0),
      record_buffer_(1000),
      solver_("dc"),
      do_normalize_(false),
      num_retained_(1),
      energy_(1)
{}

pca::pca(long num_vars)
    : num_vars_(num_vars),
      num_records_(0),
      record_buffer_(1000),
      solver_("dc"),
      do_normalize_(false),
      num_retained_(num_vars_),
      data_(record_buffer_, num_vars_),
      energy_(1),
      eigval_(num_vars_),
      eigvec_(num_vars_, num_vars_),
      proj_eigvec_(num_vars_, num_vars_),
      princomp_(record_buffer_, num_vars_),
      mean_(num_vars_),
      sigma_(num_vars_)
{
    assert_num_vars_();
    initialize_();
}

pca::~pca()
{}

bool pca::operator==(const pca& other) {
    const double eps = 1e-5;
    if (num_vars_ == other.num_vars_ &&
        num_records_ == other.num_records_ &&
        record_buffer_ == other.record_buffer_ &&
        solver_ == other.solver_ &&
        do_normalize_ == other.do_normalize_ &&
        num_retained_ == other.num_retained_ &&
        utils::is_approx_equal_container(eigval_, other.eigval_, eps) &&
        utils::is_approx_equal_container(eigvec_, other.eigvec_, eps) &&
        utils::is_approx_equal_container(princomp_, other.princomp_, eps) &&
        utils::is_approx_equal_container(energy_, other.energy_, eps) &&
        utils::is_approx_equal_container(mean_, other.mean_, eps) &&
        utils::is_approx_equal_container(sigma_, other.sigma_, eps) &&
        utils::is_approx_equal_container(proj_eigvec_, other.proj_eigvec_, eps))
        return true;
    else
        return false;
}

void pca::resize_data_if_needed_() {
    if (num_records_ == record_buffer_) {
        record_buffer_ += record_buffer_;
        data_.resize(record_buffer_, num_vars_);
    }
}

void pca::assert_num_vars_() {
    if (num_vars_ < 2)
        throw std::invalid_argument("Number of variables smaller than two.");
}

void pca::initialize_() {
    data_.zeros();
    eigval_.zeros();
    eigvec_.zeros();
    princomp_.zeros();
    mean_.zeros();
    sigma_.zeros();
    energy_.zeros();
}

void pca::set_num_variables(long num_vars) {
    num_vars_ = num_vars;
    assert_num_vars_();
    num_retained_ = num_vars_;
    data_.resize(record_buffer_, num_vars_);
    eigval_.resize(num_vars_);
    eigvec_.resize(num_vars_, num_vars_);
    mean_.resize(num_vars_);
    sigma_.resize(num_vars_);
    initialize_();
}

void pca::add_record(const std::vector<double>& record) {
    assert_num_vars_();

    if (num_vars_ != long(record.size()))
        throw std::domain_error(utils::join("Record has the wrong size: ", record.size()));

    resize_data_if_needed_();
    arma::Row<double> row(&record.front(), record.size());
    data_.row(num_records_) = std::move(row);
    ++num_records_;
}

std::vector<double> pca::get_record(long record_index) const {
    return std::move(utils::extract_row_vector(data_, record_index));
}

void pca::set_do_normalize(bool do_normalize) {
    do_normalize_ = do_normalize;
}

void pca::set_solver(const std::string& solver) {
    if (solver!="standard" && solver!="dc")
        throw std::invalid_argument(utils::join("No such solver available: ", solver));
    solver_ = solver;
}

void pca::solve() {
    assert_num_vars_();

    if (num_records_ < 2)
        throw std::logic_error("Number of records smaller than two.");

    data_.resize(num_records_, num_vars_);

    mean_ = utils::compute_column_means(data_);
    utils::remove_column_means(data_, mean_);

    sigma_ = utils::compute_column_rms(data_);
    if (do_normalize_) utils::normalize_by_column(data_, sigma_);

    arma::Col<double> eigval(num_vars_);
    arma::Mat<double> eigvec(num_vars_, num_vars_);

    arma::Mat<double> cov_mat = utils::make_covariance_matrix(data_);
    arma::eig_sym(eigval, eigvec, cov_mat, solver_.c_str());
    arma::uvec indices = arma::sort_index(eigval, 1);

    for (long i=0; i<num_vars_; ++i) {
        eigval_(i) = eigval(indices(i));
        eigvec_.col(i) = eigvec.col(indices(i));
    }

    utils::enforce_positive_sign_by_column(eigvec_);
    proj_eigvec_ = eigvec_;

    princomp_ = data_ * eigvec_;

    energy_(0) = arma::sum(eigval_);
    eigval_ *= 1./energy_(0);
}

void pca::set_num_retained(long num_retained) {
    if (num_retained<=0 || num_retained>num_vars_)
        throw std::range_error(utils::join("Value out of range: ", num_retained));

    num_retained_ = num_retained;
    proj_eigvec_ = eigvec_.submat(0, 0, eigvec_.n_rows-1, num_retained_-1);
}

std::vector<double> pca::to_principal_space(const std::vector<double>& data) const {
    arma::Col<double> column(&data.front(), data.size());
    column -= mean_;
    if (do_normalize_) column /= sigma_;
    const arma::Row<double> row(column.t() * proj_eigvec_);
    return std::move(utils::extract_row_vector(row, 0));
}

std::vector<double> pca::to_variable_space(const std::vector<double>& data) const {
    const arma::Row<double> row(&data.front(), data.size());
    arma::Col<double> column(arma::trans(row * proj_eigvec_.t()));
    if (do_normalize_) column %= sigma_;
    column += mean_;
    return std::move(utils::extract_column_vector(column, 0));
}

double pca::get_energy() const {
    return energy_(0);
}

double pca::get_eigenvalue(long eigen_index) const {
    if (eigen_index >= num_vars_)
        throw std::range_error(utils::join("Index out of range: ", eigen_index));
    return eigval_(eigen_index);
}

std::vector<double> pca::get_eigenvalues() const {
    return std::move(utils::extract_column_vector(eigval_, 0));
}

std::vector<double> pca::get_eigenvector(long eigen_index) const {
    return std::move(utils::extract_column_vector(eigvec_, eigen_index));
}

std::vector<double> pca::get_principal(long eigen_index) const {
    return std::move(utils::extract_column_vector(princomp_, eigen_index));
}

double pca::check_eigenvectors_orthogonal() const {
    return std::abs(arma::det(eigvec_));
}

double pca::check_projection_accurate() const {
    if (data_.n_cols!=eigvec_.n_cols || data_.n_rows!=princomp_.n_rows)
        throw std::runtime_error("No proper data matrix present that the projection could be compared with.");
    const arma::Mat<double> diff = (princomp_ * arma::trans(eigvec_)) - data_;
    return 1 - arma::sum(arma::sum( arma::abs(diff) )) / diff.n_elem;
}

bool pca::get_do_normalize() const {
    return do_normalize_;
}

std::string pca::get_solver() const {
    return solver_;
}

std::vector<double> pca::get_mean_values() const {
    return std::move(utils::extract_column_vector(mean_, 0));
}

std::vector<double> pca::get_sigma_values() const {
    return std::move(utils::extract_column_vector(sigma_, 0));
}

long pca::get_num_variables() const {
    return num_vars_;
}

long pca::get_num_records() const {
    return num_records_;
}

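The helper functions used in pca.cpp are declared in utils.h:

#pragma once

#include <armadillo>
#include <cmath>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>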

namespace utils {
arma::Mat<double> make_covariance_matrix(const arma::Mat<double>& data);
arma::Mat<double> make_shuffled_matrix(const arma::Mat<double>& data);
arma::Col<double> compute_column_means(const arma::Mat<double>& data);
void remove_column_means(arma::Mat<double>& data, const arma::Col<double>& means);
arma::Col<double> compute_column_rms(const arma::Mat<double>& data);
void normalize_by_column(arma::Mat<double>& data, const arma::Col<double>& rms);
void enforce_positive_sign_by_column(arma::Mat<double>& data);
std::vector<double> extract_column_vector(const arma::Mat<double>& data, long index);
std::vector<double> extract_row_vector(const arma::Mat<double>& data, long index);
void assert_file_good(const bool& is_file_good, const std::string& filename);
template<typename T>
void write_matrix_object(const std::string& filename, const T& matrix) {
    assert_file_good(matrix.quiet_save(filename, arma::arma_ascii), filename);
}

template<typename T>
void read_matrix_object(const std::string& filename, T& matrix) {
    assert_file_good(matrix.quiet_load(filename), filename);
}
template<typename T, typename U, typename V>
bool is_approx_equal(const T& value1, const U& value2, const V& eps) {
    return std::abs(value1-value2)<eps ? true : false;
}
template<typename T, typename U, typename V>
bool is_approx_equal_container(const T& container1, const U& container2, const V& eps) {
    if (container1.size()==container2.size()) {
        bool equal = true;
        for (size_t i=0; i<container1.size(); ++i) {
            equal = is_approx_equal(container1[i], container2[i], eps);
            if (!equal) break;
        }
        return equal;
    } else {
        return false;
    }
}
double get_mean(const std::vector<double>& iter);
double get_sigma(const std::vector<double>& iter);

struct join_helper {
    static void add_to_stream(std::ostream& stream) {}

    template<typename T, typename... Args>
    static void add_to_stream(std::ostream& stream, const T& arg, const Args&... args) {
        stream << arg;
        add_to_stream(stream, args...);
    }
};

template<typename T, typename... Args>
std::string join(const T& arg, const Args&... args) {
    std::ostringstream stream;
    stream << arg;
    join_helper::add_to_stream(stream, args...);
    return stream.str();
}

template<typename T>
void write_property(std::ostream& file, const std::string& key, const T& value) {
    file << key << "\t" << value << std::endl;
}

template<typename T>
void read_property(std::istream& file, const std::string& key, T& value) {
    std::string tmp;
    bool found = false;
    while (file.good()) {
        file >> tmp;
        if (tmp==key) {
            file >> value;
            found = true;
            break;
        }
    }
    if (!found)
        throw std::domain_error(join("No such key available: ", key));
    file.seekg(0);
}

} //utils

The corresponding implementations are in utils.cpp:

#include "utils.h"
#include <cmath>
#include <cstdlib>
#include <stdexcept>
#include <sstream>
#include <numeric>

namespace utils {

arma::Mat<double> make_covariance_matrix(const arma::Mat<double>& data) {
    return std::move( (data.t()*data) * (1./(data.n_rows-1)) );
}

arma::Mat<double> make_shuffled_matrix(const arma::Mat<double>& data) {
    const long n_rows = data.n_rows;
    const long n_cols = data.n_cols;
    arma::Mat<double> shuffle(n_rows, n_cols);
    for (long j=0; j<n_cols; ++j) {
        for (long i=0; i<n_rows; ++i) {
            shuffle(i, j) = data(std::rand()%n_rows, j);
        }
    }
    return std::move(shuffle);
}

arma::Col<double> compute_column_means(const arma::Mat<double>& data) {
    const long n_cols = data.n_cols;
    arma::Col<double> means(n_cols);
    for (long i=0; i<n_cols; ++i)
        means(i) = arma::mean(data.col(i));
    return std::move(means);
}

void remove_column_means(arma::Mat<double>& data, const arma::Col<double>& means) {
    if (data.n_cols != means.n_elem)
        throw std::range_error("Number of elements of means is not equal to the number of columns of data");
    for (long i=0; i<long(data.n_cols); ++i)
        data.col(i) -= means(i);
}

arma::Col<double> compute_column_rms(const arma::Mat<double>& data) {
    const long n_cols = data.n_cols;
    arma::Col<double> rms(n_cols);
    for (long i=0; i<n_cols; ++i) {
        const double dot = arma::dot(data.col(i), data.col(i));
        rms(i) = std::sqrt(dot / (data.col(i).n_rows-1));
    }
    return std::move(rms);
}

void normalize_by_column(arma::Mat<double>& data, const arma::Col<double>& rms) {
    if (data.n_cols != rms.n_elem)
        throw std::range_error("Number of elements of rms is not equal to the number of columns of data");
    for (long i=0; i<long(data.n_cols); ++i) {
        if (rms(i)==0)
            throw std::runtime_error("At least one of the entries of rms equals to zero");
        data.col(i) *= 1./rms(i);
    }
}

void enforce_positive_sign_by_column(arma::Mat<double>& data) {
    for (long i=0; i<long(data.n_cols); ++i) {
        const double max = arma::max(data.col(i));
        const double min = arma::min(data.col(i));
        bool change_sign = false;
        if (std::abs(max)>=std::abs(min)) {
            if (max<0) change_sign = true;
        } else {
            if (min<0) change_sign = true;
        }
        if (change_sign) data.col(i) *= -1;
    }
}

std::vector<double> extract_column_vector(const arma::Mat<double>& data, long index) {
    if (index<0 || index >= long(data.n_cols))
        throw std::range_error(join("Index out of range: ", index));
    const long n_rows = data.n_rows;
    const double* memptr = data.colptr(index);
    std::vector<double> result(memptr, memptr + n_rows);
    return std::move(result);
}

std::vector<double> extract_row_vector(const arma::Mat<double>& data, long index) {
    if (index<0 || index >= long(data.n_rows))
        throw std::range_error(join("Index out of range: ", index));
    const arma::Row<double> row(data.row(index));
    const double* memptr = row.memptr();
    std::vector<double> result(memptr, memptr + row.n_elem);
    return std::move(result);
}

void assert_file_good(const bool& is_file_good, const std::string& filename) {
    if (!is_file_good)
        throw std::ios_base::failure(join("Cannot open file: ", filename));
}

double get_mean(const std::vector<double>& iter) {
    const double init = 0;
    return std::accumulate(iter.begin(), iter.end(), init) / iter.size();
}

double get_sigma(const std::vector<double>& iter) {
    const double mean = get_mean(iter);
    double sum = 0;
    for (auto v=iter.begin(); v!=iter.end(); ++v)
        sum += std::pow(*v - mean, 2.);
    return std::sqrt(sum/(iter.size()-1));
}

} //utils

SWIG C++ Common Tools #

By now, you should be ready to start working with SWIG and use the code examples provided as your tools for practice. However, SWIG itself is very rich, so here I will also summarize and introduce several commonly used tools for you.
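
Before going through these tools, here is a minimal sketch of how the pca class from Code Listing 5 might be driven from Python once it has been wrapped. It assumes that the interface file (not shown in this text) includes std_vector.i and instantiates vector<double>, so that Python lists can be passed to add_record. The function name run_pca and the module name pca_module in the comments are hypothetical; the method names come from pca.h.

```python
def run_pca(pca_obj, records, normalize=False):
    """Feed records into a wrapped pca object, solve, and return the eigenvalues.

    `pca_obj` is expected to expose the methods declared in pca.h:
    set_do_normalize, add_record, solve and get_eigenvalues.
    """
    pca_obj.set_do_normalize(normalize)
    for record in records:
        # Each record must contain exactly as many values as the object has variables
        pca_obj.add_record([float(v) for v in record])
    pca_obj.solve()
    # Convert the wrapped std::vector<double> into a plain Python list
    return list(pca_obj.get_eigenvalues())


# Hypothetical usage, assuming the wrapped module is called pca_module:
#   import pca_module
#   p = pca_module.pca(3)
#   print(run_pca(p, [[1, 2, 3], [2, 1, 4], [0, 5, 1]]))
```

Keeping the loop over records in Python is fine here, because the expensive part, solve(), runs entirely in C++.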

1. Global Variables #

In Python, we can use cvar to access global variables defined in C++ code.

For example, we can define a global variable in the header file sample.h and reference it in sample.i, as shown in code examples 11 and 12.

Code example 11, sample.h:

#include <cstdint>
int32_t score = 100;

Code example 12, sample.i:

%module sample
%{
#include "sample.h"
%}

%include "sample.h"

In this way, we can directly access the corresponding global variable in the Python script using cvar, as shown in code example 13, and the output result will be 100.

Code example 13, sample.py:

import sample
print(sample.cvar.score)

2. Constants #

We can set constants in the interface definition file using %constant, as shown in code example 14.

Code example 14, sample.i:

%constant int foo = 100;
%constant const char* bar = "foobar2000";

3. Enumeration #

We can declare an enumeration in the interface file with the enum keyword. SWIG then exposes its values to Python as integer constants.
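
Because the values arrive in Python as plain integers, you may want to group them into a Python enum for readability. The following is a minimal sketch; fake_sample stands in for a SWIG-generated module, and the names RED, GREEN, BLUE and make_int_enum are hypothetical.

```python
import enum
import types

# Stand-in for a SWIG-generated module that wraps: enum Color { RED, GREEN, BLUE };
fake_sample = types.SimpleNamespace(RED=0, GREEN=1, BLUE=2)


def make_int_enum(module, enum_name, member_names):
    """Collect integer constants from a module into an IntEnum."""
    members = {name: getattr(module, name) for name in member_names}
    return enum.IntEnum(enum_name, members)


Color = make_int_enum(fake_sample, "Color", ["RED", "GREEN", "BLUE"])
print(Color(1).name)    # GREEN
print(Color.BLUE == 2)  # True, IntEnum members compare equal to ints
```

Since IntEnum members are still integers, they can be passed back to wrapped C++ functions that expect the enum type.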

4. Pointers and References #

In the C++ world, pointers are an inseparable concept. They are everywhere and we need to use them all the time. Therefore, I think it is necessary to introduce how to operate on pointers and references in C++ here.

SWIG has good support for pointers and some support for smart pointers. In recent update logs, I found that its support for smart pointers has been constantly updated. The following code examples 15 and 16 demonstrate the usage of pointers and references.

Code example 15, sample.h:

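#include <cstdint>
#include <cstdio>

// Minimal ClassA, assumed here for illustration; the original listing does not show its definition
class ClassA {
public:
   int result = 0;
};

void passPointer(ClassA* ptr) {
   printf("result= %d", ptr->result);
}

void passReference(const ClassA& ref) {
   printf("result= %d", ref.result);
}

void passValue(ClassA obj) {
   printf("result= %d", obj.result);
}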

Code example 16, sample.py:

import sample

a = sample.ClassA() # create an instance of ClassA
sample.passPointer(a)
sample.passReference(a)
sample.passValue(a)

5. Strings #

In industrial-grade code, we often use std::string. In the context of SWIG, using strings from the standard library requires you to declare %include "std_string.i" in the interface file to ensure automatic conversion from C++ std::string to Python str. The specific content is shown in code example 17.

Code example 17, sample.i:

%module sample

%include "std_string.i"

6. Vectors #

std::vector is the most frequently used sequence container in the STL. Because it is a class template, using it is slightly more involved than using strings: each concrete type must be instantiated with %template. The details are shown in code example 18.

Code example 18, sample.i:

%module sample

%include "std_string.i"
%include "std_vector.i"

namespace std {
 %template(DoubleVector) vector<double>;
}

7. Maps #

std::map is also one of the most frequently used containers in the STL. Like std::vector, it is a class template, so each concrete type must be instantiated with %template. The details can be seen in code example 19.

Code example 19, sample.i:

%module sample

%include "std_string.i"
%include "std_map.i"

namespace std {
 %template(Int2strMap) map<int, string>;
 %template(Str2intMap) map<string, int>;
}

Learning Path #

So far, we have achieved the goal of getting started with SWIG. Today’s content can be treated as a SWIG programming guide. I have provided you with 19 code examples that you can use to get hands-on experience. However, if you want to further improve in this area, what should you do? Don’t worry, at the end of today’s lesson, I will share with you an efficient learning path for SWIG.

First of all, it is important not to stray from the official documentation when learning any technology. The SWIG website provides incredibly detailed documentation, which is undoubtedly the best way to master the usage of SWIG.

Secondly, to delve deeper into SWIG, it is crucial to have a comprehensive understanding of C++. C++ is always an essential topic in high-performance computing, especially when it comes to memory management, pointers, and the use of virtual functions. You need to gain practical experience in writing C++ code in order to gradually master these concepts. Even if you are only looking to wrap other C++ libraries for Python usage, it is still necessary to have a basic understanding of C++. This will help you find directions to solve any compilation or linking errors you may encounter in the future.

Lastly, I will list some learning materials for your further reference.

The first is the SWIG documentation.

a. http://www.swig.org/doc.html
b. http://www.swig.org/Doc4.0/SWIGPlus.html
c. PDF version: http://www.swig.org/Doc4.0/SWIGDocumentation.pdf

The second is the book “C++ Primer”. As a classic book in the field of C++, it provides great help in gaining a comprehensive understanding of C++.

The third is the book “Advanced C/C++ Compilation Techniques”. This book covers more advanced topics and can be used to improve and expand your knowledge of C++.

Alright, that concludes today’s content. If you have any takeaways or questions about SWIG, feel free to leave a message and share your thoughts. You are also welcome to share this article with your colleagues and friends so that we can learn and progress together.