📜  ML - Neural Network Implementation in C++ From Scratch

📅  Last modified: 2021-06-01 01:10:54             🧑  Author: Mango


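If you have already implemented a neural network model in some other programming language, you may have noticed (especially on a low-end PC) that the model runs rather slowly even on a small dataset. When you started learning about neural networks, you may have searched for which language is best for machine learning. The usual answer is that Python or R is best and that other languages are hard, so you should not waste time on them. Once users start programming in those languages, however, they run into time and resource consumption problems. This article therefore shows how to build a very fast neural network in C++.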


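To follow this article you need:

  • Basic knowledge of what classes are and how they work.
  • A linear algebra library called Eigen.
  • Some basic read/write operations in C++.
  • Some basic knowledge of linear algebra, since we are using a library for it.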


Eigen resources:

  • Getting started!
  • The Eigen Matrix class


Neural network background:

  • Neural network basics
  • Forward and backward propagation in neural networks


// NeuralNetwork.hpp
#pragma once
#include <Eigen/Dense> // adjust to your Eigen install path, e.g. <eigen3/Eigen/Dense>
#include <iostream>
#include <vector>

// use typedefs so the data type is easy to change later, e.g. float to double
typedef float Scalar;
typedef Eigen::MatrixXf Matrix;
typedef Eigen::RowVectorXf RowVector;
typedef Eigen::VectorXf ColVector;
typedef unsigned int uint; // uint is not standard C++, so define it explicitly

// activation function and its derivative (defined in NeuralNetwork.cpp)
Scalar activationFunction(Scalar x);
Scalar activationFunctionDerivative(Scalar x);

// neural network implementation class
class NeuralNetwork {
public:
    // constructor
    NeuralNetwork(std::vector<uint> topology, Scalar learningRate = Scalar(0.005));
    // function for forward propagation of data
    void propagateForward(RowVector& input);
    // function for backward propagation of errors made by neurons
    void propagateBackward(RowVector& output);
    // function to calculate errors made by neurons in each layer
    void calcErrors(RowVector& output);
    // function to update the weights of connections
    void updateWeights();
    // function to train the neural network given arrays of inputs and expected outputs
    void train(std::vector<RowVector*> input_data, std::vector<RowVector*> output_data);
    // storage objects for the working of the neural network
    /*
          The containers below hold pointers instead of objects. A std::vector<Class>
          copies (or moves) each element on push_back and whenever it grows, and
          destroys the originals; with pointers the layers stay where they are, and
          the neural network class itself stays lightweight. Smart pointers would be
          a better choice than the raw pointers used here.
    */
    std::vector<uint> topology; // number of neurons in each layer
    std::vector<RowVector*> neuronLayers; // stores the different layers of our network
    std::vector<RowVector*> cacheLayers; // stores the unactivated (activation fn not yet applied) values of layers
    std::vector<RowVector*> deltas; // stores the error contribution of each neuron
    std::vector<Matrix*> weights; // the connection weights themselves
    Scalar learningRate;
};
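
The comment in the class recommends smart pointers over raw ones. A minimal sketch of that idea follows. It uses a plain `std::vector<float>` as a stand-in for Eigen's `RowVector` so that it compiles with the standard library alone; `Layer` and `makeLayers` are hypothetical names for illustration and are not part of the article's class.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Stand-in for Eigen's RowVector so the sketch needs only the standard library.
typedef std::vector<float> Layer;

// Builds one layer per topology entry. Every layer except the last one gets
// an extra slot for the bias neuron, which is fixed at 1.
std::vector<std::unique_ptr<Layer>> makeLayers(const std::vector<unsigned>& topology)
{
    std::vector<std::unique_ptr<Layer>> layers;
    for (std::size_t i = 0; i < topology.size(); i++) {
        bool isOutput = (i == topology.size() - 1);
        std::size_t n = topology[i] + (isOutput ? 0 : 1);
        layers.push_back(std::unique_ptr<Layer>(new Layer(n, 0.0f)));
        if (!isOutput)
            layers.back()->back() = 1.0f; // bias neuron
    }
    return layers; // every Layer is freed automatically with the vector
}
```

With `std::unique_ptr`, the class would need no hand-written destructor to release its layers, which the raw-pointer version in this article never does.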

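Next, we implement each function step by step. First, create two files (NeuralNetwork.cpp and NeuralNetwork.hpp) and write the NeuralNetwork class code above in NeuralNetwork.hpp. The following lines of code must be copied into the NeuralNetwork.cpp file.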


// NeuralNetwork.cpp
#include "NeuralNetwork.hpp"
#include <cmath>

// constructor of neural network class
NeuralNetwork::NeuralNetwork(std::vector<uint> topology, Scalar learningRate)
{
    this->topology = topology;
    this->learningRate = learningRate;
    for (uint i = 0; i < topology.size(); i++) {
        // initialize neuron layers; every layer except the output gets one extra bias neuron
        if (i == topology.size() - 1)
            neuronLayers.push_back(new RowVector(topology[i]));
        else
            neuronLayers.push_back(new RowVector(topology[i] + 1));
        // initialize cache and delta vectors with the same length as the layer
        // (the layer's own size, not the number of layers created so far)
        cacheLayers.push_back(new RowVector(neuronLayers.back()->size()));
        deltas.push_back(new RowVector(neuronLayers.back()->size()));
        // vector.back() gives the handle to the most recently added element
        // coeffRef gives a reference to the value at that position
        // (used here because we are working through pointers)
        if (i != topology.size() - 1) {
            neuronLayers.back()->coeffRef(topology[i]) = 1.0;
            cacheLayers.back()->coeffRef(topology[i]) = 1.0;
        }
        // initialize weights matrix
        if (i > 0) {
            if (i != topology.size() - 1) {
                weights.push_back(new Matrix(topology[i - 1] + 1, topology[i] + 1));
                // new Matrix(...) leaves its entries uninitialized, so fill them in
                weights.back()->setRandom();
                // the last column only passes the bias value 1 on to the next layer
                weights.back()->col(topology[i]).setZero();
                weights.back()->coeffRef(topology[i - 1], topology[i]) = 1.0;
            }
            else {
                weights.push_back(new Matrix(topology[i - 1] + 1, topology[i]));
                weights.back()->setRandom();
            }
        }
    }
}

One notation we will use for matrix dimensions: [m n] denotes a matrix with m rows and n columns.

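Initializing the weight matrices is a little tricky (mathematically). Pay close attention to the next few lines, because they explain how weight matrices are used in this article. I assume you know how the layers of a neural network are connected to each other. Below, CURRENT_LAYER is the layer receiving input and PREV_LAYER is the layer immediately before it.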

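  • Column c of the weight matrix represents the connections of neuron c in CURRENT_LAYER to all neurons in PREV_LAYER.
  • Element r of column c of the weight matrix represents the connection of neuron c in CURRENT_LAYER to neuron r in PREV_LAYER.
  • Row r of the weight matrix represents the connections of all neurons in PREV_LAYER to neuron r in CURRENT_LAYER.
  • Element c of row r of the weight matrix represents the connection of neuron c in PREV_LAYER to neuron r in CURRENT_LAYER.
  • Points 1 and 2 apply when we use the weight matrix in the normal sense, and points 3 and 4 apply when we use it in the transposed sense (a(i, j) = a(j, i)).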
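
The convention in points 1 and 2 can be sketched as a row vector multiplied by a weight matrix. This sketch uses nested `std::vector` instead of Eigen so that it stands alone; `Row`, `Mat` and `rowTimesMatrix` are hypothetical names chosen for illustration. In [m n] notation, a [1 m] row of PREV_LAYER values times an [m n] weight matrix gives a [1 n] row of CURRENT_LAYER values.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

typedef std::vector<float> Row;               // a [1 m] row vector
typedef std::vector<std::vector<float>> Mat;  // an [m n] matrix, indexed Mat[r][c]

// Computes prev * W, where prev is [1 m] and W is [m n]; the result is [1 n].
// W[r][c] is the weight from neuron r of PREV_LAYER to neuron c of CURRENT_LAYER,
// so column c collects every connection into neuron c.
Row rowTimesMatrix(const Row& prev, const Mat& W)
{
    std::size_t m = W.size();
    std::size_t n = (m > 0) ? W[0].size() : 0;
    Row out(n, 0.0f);
    for (std::size_t c = 0; c < n; c++)
        for (std::size_t r = 0; r < m; r++)
            out[c] += prev[r] * W[r][c];
    return out;
}
```

For example, with two inputs plus a bias neuron, `prev = {1, 2, 1}`, and a [3 2] matrix whose last row holds the bias weights 10 and 20, the last row is simply added to each output because the bias neuron is always 1.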



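void NeuralNetwork::propagateForward(RowVector& input)
{
    // set the input to the input layer
    // block returns a part of the given vector or matrix
    // block takes 4 arguments: startRow, startCol, blockRows, blockCols
    // (the last element is the bias neuron, so it is left out)
    neuronLayers.front()->block(0, 0, 1, neuronLayers.front()->size() - 1) = input;
    // propagate the data forward, one layer at a time
    for (uint i = 1; i < topology.size(); i++) {
        // [1 m] row of the previous layer times the [m n] weight matrix, as explained above;
        // keep the unactivated values in the cache for the derivative used in updateWeights
        (*cacheLayers[i]) = (*neuronLayers[i - 1]) * (*weights[i - 1]);
        (*neuronLayers[i]) = (*cacheLayers[i]);
        // apply the activation function to hidden layers only (the output layer stays linear)
        // unaryExpr returns a new expression, so the result has to be assigned back
        if (i != topology.size() - 1) {
            neuronLayers[i]->block(0, 0, 1, topology[i]) = cacheLayers[i]->block(0, 0, 1, topology[i]).unaryExpr([](Scalar x) { return activationFunction(x); });
        }
    }
}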



void NeuralNetwork::calcErrors(RowVector& output)
{
    // calculate the errors made by neurons of the last layer
    (*deltas.back()) = output - (*neuronLayers.back());
    // error calculation of hidden layers is different:
    // we begin with the last hidden layer
    // and continue down to the first hidden layer
    for (uint i = topology.size() - 2; i > 0; i--) {
        // weight matrix used in the transposed sense (points 3 and 4 above)
        (*deltas[i]) = (*deltas[i + 1]) * (weights[i]->transpose());
    }
}
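
The hidden-layer step in `calcErrors` multiplies a row of errors by the transposed weight matrix. A minimal standalone sketch of that product, written with nested `std::vector` rather than Eigen, is shown here; `Row`, `Mat` and `rowTimesTranspose` are hypothetical names. An error row of size [1 n] times the transpose of an [m n] matrix gives a [1 m] row, one entry per neuron of the earlier layer.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

typedef std::vector<float> Row;               // a row vector
typedef std::vector<std::vector<float>> Mat;  // an [m n] matrix, indexed Mat[r][c]

// Computes delta * W^T, where delta is [1 n] and W is [m n]; the result is [1 m].
// Entry r sums the errors of the later layer, each weighted by the connection
// W[r][c] that carried neuron r's value forward.
Row rowTimesTranspose(const Row& delta, const Mat& W)
{
    std::size_t m = W.size();
    Row out(m, 0.0f);
    for (std::size_t r = 0; r < m; r++)
        for (std::size_t c = 0; c < delta.size(); c++)
            out[r] += delta[c] * W[r][c];
    return out;
}
```

Reading `W[r][c]` row by row instead of column by column is exactly what using the matrix "in the transposed sense" means; no transposed copy is needed.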


void NeuralNetwork::updateWeights()
{
    // topology.size() - 1 = weights.size()
    for (uint i = 0; i < topology.size() - 1; i++) {
        // in this loop we iterate over the layers that receive weights (first hidden layer to output layer)
        if (i != topology.size() - 2) {
            // not the output layer: there is a bias neuron, so number of neurons specified = number of cols - 1
            // (the last column feeds the bias neuron and is never updated)
            for (uint c = 0; c < weights[i]->cols() - 1; c++) {
                for (uint r = 0; r < weights[i]->rows(); r++) {
                    weights[i]->coeffRef(r, c) += learningRate * deltas[i + 1]->coeffRef(c) * activationFunctionDerivative(cacheLayers[i + 1]->coeffRef(c)) * neuronLayers[i]->coeffRef(r);
                }
            }
        }
        else {
            // output layer: there is no bias neuron, so number of neurons specified = number of cols
            // no activation is applied to the output layer, so its derivative factor is 1
            for (uint c = 0; c < weights[i]->cols(); c++) {
                for (uint r = 0; r < weights[i]->rows(); r++) {
                    weights[i]->coeffRef(r, c) += learningRate * deltas[i + 1]->coeffRef(c) * neuronLayers[i]->coeffRef(r);
                }
            }
        }
    }
}
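
The update applied to each weight in `updateWeights` is the product of the learning rate, the error of the receiving neuron, the activation derivative at that neuron's unactivated value, and the value of the sending neuron. A minimal sketch of that rule for a single layer follows, assuming tanh as the activation; it uses nested `std::vector` rather than Eigen, and `updateLayer` and `tanhDerivative` are hypothetical names.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

typedef std::vector<float> Row;
typedef std::vector<std::vector<float>> Mat;  // indexed Mat[r][c]

// tanh'(x) = 1 - tanh(x)^2
float tanhDerivative(float x)
{
    float t = std::tanh(x);
    return 1.0f - t * t;
}

// Applies W[r][c] += lr * delta[c] * f'(z[c]) * a[r] to every weight, where
// a holds the sending layer's values, z the receiving layer's unactivated
// values and delta the receiving layer's errors.
void updateLayer(Mat& W, const Row& a, const Row& delta, const Row& z, float lr)
{
    for (std::size_t r = 0; r < W.size(); r++)
        for (std::size_t c = 0; c < W[r].size(); c++)
            W[r][c] += lr * delta[c] * tanhDerivative(z[c]) * a[r];
}
```

With `a = {1, 2}`, a single receiving neuron with `delta = {0.5}` and `z = {0}` (where the tanh derivative is 1), and a learning rate of 0.1, the two weights grow by 0.05 and 0.1 respectively.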


void NeuralNetwork::propagateBackward(RowVector& output)
{
    calcErrors(output);
    updateWeights();
}


Scalar activationFunction(Scalar x)
{
    return tanhf(x);
}
Scalar activationFunctionDerivative(Scalar x)
{
    return 1 - tanhf(x) * tanhf(x);
}
// you can use your own activation function and derivative here!
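
If you swap in your own activation function, it is easy to get its derivative wrong. One way to check, sketched here under the assumption that the function is smooth, is to compare the hand-written derivative against a central-difference approximation. The names `activation`, `activationDerivative` and `numericalDerivative` are hypothetical, and `double` is used instead of `Scalar` to keep rounding error small.

```cpp
#include <cassert>
#include <cmath>

// The pair under test: tanh and its analytic derivative 1 - tanh(x)^2.
double activation(double x) { return std::tanh(x); }
double activationDerivative(double x)
{
    double t = std::tanh(x);
    return 1.0 - t * t;
}

// Central-difference approximation of f'(x): (f(x + h) - f(x - h)) / (2h).
double numericalDerivative(double (*f)(double), double x, double h = 1e-5)
{
    return (f(x + h) - f(x - h)) / (2.0 * h);
}
```

The two values should agree to several decimal places at a handful of sample points; a large gap usually points to a mistake in the analytic derivative.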


void NeuralNetwork::train(std::vector<RowVector*> input_data, std::vector<RowVector*> output_data)
{
    for (uint i = 0; i < input_data.size(); i++) {
        std::cout << "Input to neural network is : " << *input_data[i] << std::endl;
        propagateForward(*input_data[i]);
        std::cout << "Expected output is : " << *output_data[i] << std::endl;
        std::cout << "Output produced is : " << *neuronLayers.back() << std::endl;
        propagateBackward(*output_data[i]);
        // the square root makes this a root-mean-square error rather than a plain MSE
        std::cout << "RMSE : " << std::sqrt((*deltas.back()).dot((*deltas.back())) / deltas.back()->size()) << std::endl;
    }
}
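
The last value printed by `train` divides the summed squared error by the number of outputs and then takes the square root, which makes it a root-mean-square error. A minimal standalone sketch of that computation, with the hypothetical name `rootMeanSquare` and a plain `std::vector<float>` in place of the Eigen row vector:

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Root-mean-square of a list of errors: sqrt(sum(e^2) / n).
// Returns 0 for an empty list to avoid dividing by zero.
float rootMeanSquare(const std::vector<float>& errors)
{
    if (errors.empty())
        return 0.0f;
    float sumOfSquares = 0.0f;
    for (float e : errors)
        sumOfSquares += e * e;
    return std::sqrt(sumOfSquares / errors.size());
}
```

For a network with a single output, as in the { 2, 3, 1 } topology used later, this reduces to the absolute value of the one error.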


void ReadCSV(std::string filename, std::vector<RowVector*>& data)
{
    std::ifstream file(filename);
    if (!file.is_open())
        return; // nothing to read
    std::string line, word;
    // read the first line to determine the number of columns in the file
    getline(file, line, '\n');
    std::stringstream ss(line);
    std::vector<Scalar> parsed_vec;
    // the delimiter must be a single character: ',' rather than ', '
    while (getline(ss, word, ',')) {
        parsed_vec.push_back(Scalar(std::stof(word))); // stof skips leading spaces
    }
    uint cols = parsed_vec.size();
    data.push_back(new RowVector(cols));
    for (uint i = 0; i < cols; i++) {
        data.back()->coeffRef(i) = parsed_vec[i]; // a row vector takes a single index
    }
    // read the remaining lines of the file
    while (getline(file, line, '\n')) {
        std::stringstream ss(line);
        data.push_back(new RowVector(cols));
        uint i = 0;
        while (i < cols && getline(ss, word, ',')) {
            data.back()->coeffRef(i) = Scalar(std::stof(word));
            i++;
        }
    }
}
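
The parsing step inside `ReadCSV` can be tried on its own. This sketch splits a single comma-separated line into floats with `std::getline` and `std::stof`; `parseCsvLine` is a hypothetical helper name. Note that the delimiter passed to `std::getline` is one character, and that `std::stof` skips leading whitespace but throws `std::invalid_argument` on a field with no number in it.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Splits one comma-separated line into floats. Because std::stof ignores
// leading whitespace, "0.5, 0.25" and "0.5,0.25" parse the same way.
std::vector<float> parseCsvLine(const std::string& line)
{
    std::vector<float> values;
    std::stringstream ss(line);
    std::string word;
    while (std::getline(ss, word, ','))
        values.push_back(std::stof(word));
    return values;
}
```

The data generator in this article writes fields separated by a comma and a space, which is why tolerating the leading space matters.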



void genData(std::string filename)
{
    // writes 1000 samples: inputs (x, y) to <filename>-in, targets 2x + 10 + y to <filename>-out
    std::ofstream file1(filename + "-in");
    std::ofstream file2(filename + "-out");
    for (uint r = 0; r < 1000; r++) {
        Scalar x = rand() / Scalar(RAND_MAX);
        Scalar y = rand() / Scalar(RAND_MAX);
        file1 << x << ", " << y << std::endl;
        file2 << 2 * x + 10 + y << std::endl;
    }
    file1.close();
    file2.close();
}


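// main.cpp
// don't forget to include our neural network
#include "NeuralNetwork.hpp"
#include <cstdlib>
#include <fstream>
#include <sstream>
#include <string>
//... ReadCSV and data generator code here
typedef std::vector<RowVector*> data;
int main()
{
    NeuralNetwork n({ 2, 3, 1 });
    data in_dat, out_dat;
    genData("test"); // writes the test-in and test-out files read below
    ReadCSV("test-in", in_dat);
    ReadCSV("test-out", out_dat);
    n.train(in_dat, out_dat);
    return 0;
}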

g++ main.cpp NeuralNetwork.cpp -o main && ./main
