Caffe Source Code Deep Dive

Date: 2020-10-07 21:44:58

Author: jiongnima

Caffe Source Code Deep Dive 4: the layer built for heavy customization, layer.hpp and layer.cpp

jiongnima 2017-02-20 21:26:08

Category: caffe, source code analysis. Tags: caffe, source code analysis, deep learning, layer.cpp, network layer

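Up to part 3 of this series we analyzed Caffe's low-level data code and saw how data is stored and moved around inside the framework. Starting with this post, I move on to higher-level code, beginning with the layer, the building block of deep neural networks in Caffe. To a programmer using Caffe, layers are like toy bricks that can be stacked and joined into all sorts of handsome toy castles. The bricks themselves can also be heavily modified, and in the hands of a creative deep learning practitioner, combinations of modified layers can tackle a wide variety of deep learning problems.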

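However dazzling the modified versions are, every modified brick starts out as a plain block of raw wood, so we start with that block. As usual, the annotated source comes first.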

First, the code of layer.hpp:

#ifndef CAFFE_LAYER_H_
#define CAFFE_LAYER_H_
#include <algorithm>
#include <string>
#include <vector>
#include "caffe/blob.hpp"
#include "caffe/common.hpp"
#include "caffe/layer_factory.hpp"
#include "caffe/proto/caffe.pb.h"
#include "caffe/util/math_functions.hpp"
/**
Forward declare boost::thread instead of including boost/thread.hpp
to avoid a boost/NVCC issues (#1009, #1010) on OSX.
*/
/* Forward declaration of the mutex used to serialize the forward pass of shared layers */
namespace boost { class mutex; }
namespace caffe {
/**
* @brief An interface for the units of computation which can be composed into a
* Net.
*
* Layer%s must implement a Forward function, in which they take their input
* (bottom) Blob%s (if any) and compute their output Blob%s (if any).
* They may also implement a Backward function, in which they compute the error
* gradients with respect to their input Blob%s, given the error gradients with
* their output Blob%s.
*/
template <typename Dtype>
class Layer {
public:
/**
* You should not implement your own constructor. Any set up code should go
* to SetUp(), where the dimensions of the bottom blobs are provided to the
* layer.
*/
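/* Layer constructor. It does not need to be modified when writing a custom layer. It records the layer's
phase (TRAIN or TEST) and, if the LayerParameter proto already carries parameter blobs, copies them into blobs_ */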
explicit Layer(const LayerParameter& param)
: layer_param_(param), is_shared_(false) {
// Set phase and copy blobs (if there are any).
phase_ = param.phase();
if (layer_param_.blobs_size() > 0) {// if the layer definition carries parameter blobs, load them here
blobs_.resize(layer_param_.blobs_size());
for (int i = 0; i < layer_param_.blobs_size(); ++i) {
blobs_[i].reset(new Blob<Dtype>());
blobs_[i]->FromProto(layer_param_.blobs(i));
}
}
}
virtual ~Layer() {}
/**
* @brief Implements common layer setup functionality.
*
* @param bottom the preshaped input blobs
* @param top
* the allocated but unshaped output blobs, to be shaped by Reshape
*
* Checks that the number of bottom and top blobs is correct.
* Calls LayerSetUp to do special layer setup for individual layer types,
* followed by Reshape to set up sizes of top blobs and internal buffers.
* Sets up the loss weight multiplier blobs for any non-zero loss weights.
* This method may not be overridden.
*/
void SetUp(const vector<Blob<Dtype>*>& bottom,
const vector<Blob<Dtype>*>& top) {
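InitMutex();// allocate the mutex used during the forward pass
CheckBlobCounts(bottom, top);// check that the numbers of bottom and top blobs are valid
LayerSetUp(bottom, top);// layer-specific one-time setup; typically reads parameters from the layer definition
Reshape(bottom, top);// shapes the top blobs and any internal buffers
SetLossWeights(top);// sets the loss weights of the top blobs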
}
/**
* @brief Does layer-specific setup: your layer should implement this function
* as well as Reshape.
*
* @param bottom
* the preshaped input blobs, whose data fields store the input data for
* this layer
* @param top
* the allocated but unshaped output blobs
*
* This method should do one-time layer specific setup. This includes reading
* and processing relevant parameters from the <code>layer_param_</code>.
* Setting up the shapes of top blobs and internal buffers should be done in
* <code>Reshape</code>, which will be called before the forward pass to
* adjust the top blob sizes.
*/
/* LayerSetUp usually reads parameters from the layer definition */
virtual void LayerSetUp(const vector<Blob<Dtype>*>& bottom,
const vector<Blob<Dtype>*>& top) {}
/**
* @brief Whether a layer should be shared by multiple nets during data
* parallelism. By default, all layers except for data layers should
* not be shared. data layers should be shared to ensure each worker
* solver access data sequentially during data parallelism.
*/
/* Whether the layer may be shared by multiple nets; only data layers are allowed to be shared */
virtual inline bool ShareInParallel() const { return false; }
/** @brief Return whether this layer is actually shared by other nets.
* If ShareInParallel() is true and using more than one GPU and the
* net has TRAIN phase, then this function is expected return true.
*/
/* Returns true when ShareInParallel() is true and training runs on multiple GPUs, meaning this data layer is actually shared by other nets */
inline bool IsShared() const { return is_shared_; }
/** @brief Set whether this layer is actually shared by other nets
* If ShareInParallel() is true and using more than one GPU and the
* net has TRAIN phase, then is_shared should be set true.
*/
/* Marks the layer as shared; the CHECK fails if sharing is requested for a layer that does not support it */
inline void SetShared(bool is_shared) {
CHECK(ShareInParallel() || !is_shared)
<< type() << "Layer does not support sharing.";
is_shared_ = is_shared;
}
/**
* @brief Adjust the shapes of top blobs and internal buffers to accommodate
* the shapes of the bottom blobs.
*
* @param bottom the input blobs, with the requested input shapes
* @param top the top blobs, which should be reshaped as needed
*
* This method should reshape top blobs as needed according to the shapes
* of the bottom (input) blobs, as well as reshaping any internal buffers
* and making any other necessary adjustments so that the layer can
* accommodate the bottom blobs.
*/
/* Pure virtual, so every layer must implement it: Reshape settles the sizes of the layer's top blobs from its bottom blobs */
virtual void Reshape(const vector<Blob<Dtype>*>& bottom,
const vector<Blob<Dtype>*>& top) = 0;
/**
* @brief Given the bottom blobs, compute the top blobs and the loss.
*
* @param bottom
* the input blobs, whose data fields store the input data for this layer
* @param top
* the preshaped output blobs, whose data fields will store this layers'
* outputs
* \return The total loss from the layer.
*
* The Forward wrapper calls the relevant device wrapper function
* (Forward_cpu or Forward_gpu) to compute the top blob values given the
* bottom blobs. If the layer has any non-zero loss_weights, the wrapper
* then computes and returns the loss.
*
* Your layer should implement Forward_cpu and (optionally) Forward_gpu.
*/
/* Forward runs the layer's forward pass; its definition follows the class */
inline Dtype Forward(const vector<Blob<Dtype>*>& bottom,
const vector<Blob<Dtype>*>& top);
/**
* @brief Given the top blob error gradients, compute the bottom blob error
* gradients.
*
* @param top
* the output blobs, whose diff fields store the gradient of the error
* with respect to themselves
* @param propagate_down
* a vector with equal length to bottom, with each index indicating
* whether to propagate the error gradients down to the bottom blob at
* the corresponding index
* @param bottom
* the input blobs, whose diff fields will store the gradient of the error
* with respect to themselves after Backward is run
*
* The Backward wrapper calls the relevant device wrapper function
* (Backward_cpu or Backward_gpu) to compute the bottom blob diffs given the
* top blob diffs.
*
* Your layer should implement Backward_cpu and (optionally) Backward_gpu.
*/
/* Backward runs the layer's backward pass; its definition follows the class */
inline void Backward(const vector<Blob<Dtype>*>& top,
const vector<bool>& propagate_down,
const vector<Blob<Dtype>*>& bottom);
/**
* @brief Returns the vector of learnable parameter blobs.
*/
/* Returns blobs_, the vector of shared pointers to the parameter blobs; these are the blobs exchanged with the proto */
vector<shared_ptr<Blob<Dtype> > >& blobs() {
return blobs_;
}
/**
* @brief Returns the layer parameter.
*/
/* Returns the layer parameters defined in the prototxt file */
const LayerParameter& layer_param() const { return layer_param_; }
/**
* @brief Writes the layer parameter to a protocol buffer
*/
/* ToProto writes the layer parameters, including the parameter blobs, into a proto */
virtual void ToProto(LayerParameter* param, bool write_diff = false);
/**
* @brief Returns the scalar loss associated with a top blob at a given index.
*/
/* Returns the loss_ entry for the given top blob index (SetLossWeights stores the loss weight there) */
inline Dtype loss(const int top_index) const {
return (loss_.size() > top_index) ? loss_[top_index] : Dtype(0);
}
/**
* @brief Sets the loss associated with a top blob at a given index.
*/
/* Sets the loss_ entry for the given top blob index */
inline void set_loss(const int top_index, const Dtype value) {
if (loss_.size() <= top_index) {
loss_.resize(top_index + 1, Dtype(0));
}
loss_[top_index] = value;
}
/**
* @brief Returns the layer type.
*/
/* Returns the layer type */
virtual inline const char* type() const { return ""; }
/**
* @brief Returns the exact number of bottom blobs required by the layer,
* or -1 if no exact number is required.
*
* This method should be overridden to return a non-negative value if your
* layer expects some exact number of bottom blobs.
*/
/* Returns the exact number of bottom blobs the layer requires */
virtual inline int ExactNumBottomBlobs() const { return -1; }
/**
* @brief Returns the minimum number of bottom blobs required by the layer,
* or -1 if no minimum number is required.
*
* This method should be overridden to return a non-negative value if your
* layer expects some minimum number of bottom blobs.
*/
/* Returns the minimum number of bottom blobs */
virtual inline int MinBottomBlobs() const { return -1; }
/**
* @brief Returns the maximum number of bottom blobs required by the layer,
* or -1 if no maximum number is required.
*
* This method should be overridden to return a non-negative value if your
* layer expects some maximum number of bottom blobs.
*/
/* Returns the maximum number of bottom blobs */
virtual inline int MaxBottomBlobs() const { return -1; }
/**
* @brief Returns the exact number of top blobs required by the layer,
* or -1 if no exact number is required.
*
* This method should be overridden to return a non-negative value if your
* layer expects some exact number of top blobs.
*/
/* Returns the exact number of top blobs the layer requires */
virtual inline int ExactNumTopBlobs() const { return -1; }
/**
* @brief Returns the minimum number of top blobs required by the layer,
* or -1 if no minimum number is required.
*
* This method should be overridden to return a non-negative value if your
* layer expects some minimum number of top blobs.
*/
/* Returns the minimum number of top blobs */
virtual inline int MinTopBlobs() const { return -1; }
/**
* @brief Returns the maximum number of top blobs required by the layer,
* or -1 if no maximum number is required.
*
* This method should be overridden to return a non-negative value if your
* layer expects some maximum number of top blobs.
*/
/* Returns the maximum number of top blobs */
virtual inline int MaxTopBlobs() const { return -1; }
/**
* @brief Returns true if the layer requires an equal number of bottom and
* top blobs.
*
* This method should be overridden to return true if your layer expects an
* equal number of bottom and top blobs.
*/
/* Returns whether the layer requires equal numbers of bottom and top blobs */
virtual inline bool EqualNumBottomTopBlobs() const { return false; }
/**
* @brief Return whether "anonymous" top blobs are created automatically
* by the layer.
*
* If this method returns true, Net::Init will create enough "anonymous" top
* blobs to fulfill the requirement specified by ExactNumTopBlobs() or
* MinTopBlobs().
*/
/* If AutoTopBlobs returns true, Net creates enough anonymous top blobs to satisfy ExactNumTopBlobs() or MinTopBlobs() */
virtual inline bool AutoTopBlobs() const { return false; }
/**
* @brief Return whether to allow force_backward for a given bottom blob
* index.
*
* If AllowForceBackward(i) == false, we will ignore the force_backward
* setting and backpropagate to blob i only if it needs gradient information
* (as is done when force_backward == false).
*/
/* Whether force_backward is allowed for the bottom blob at bottom_index; if this returns false, force_backward is ignored for that blob */
virtual inline bool AllowForceBackward(const int bottom_index) const {
return true;
}
/**
* @brief Specifies whether the layer should compute gradients w.r.t. a
* parameter at a particular index given by param_id.
*
* You can safely ignore false values and always compute gradients
* for all parameters, but possibly with wasteful computation.
*/
/* Returns whether the gradient of the parameter with the given id should be computed */
inline bool param_propagate_down(const int param_id) {
return (param_propagate_down_.size() > param_id) ?
param_propagate_down_[param_id] : false;
}
/**
* @brief Sets whether the layer should compute gradients w.r.t. a
* parameter at a particular index given by param_id.
*/
/* Sets whether the gradient of the parameter with the given id should be computed */
inline void set_param_propagate_down(const int param_id, const bool value) {
if (param_propagate_down_.size() <= param_id) {
param_propagate_down_.resize(param_id + 1, true);
}
param_propagate_down_[param_id] = value;
}
protected:
/** The protobuf that stores the layer parameters */
LayerParameter layer_param_;// the layer's parameters
/** The phase: TRAIN or TEST */
Phase phase_;// the layer's phase: TRAIN or TEST
/** The vector that stores the learnable parameters as a set of blobs. */
vector<shared_ptr<Blob<Dtype> > > blobs_;// shared pointers to the layer's learnable parameter blobs
/** Vector indicating whether to compute the diff of each param blob. */
vector<bool> param_propagate_down_;// one flag per parameter blob: compute its gradient or not
/** The vector that indicates whether each top blob has a non-zero weight in
* the objective function. */
vector<Dtype> loss_;// the loss weight of each top blob
/** @brief Using the CPU device, compute the layer output. */
/* Forward pass on the CPU; pure virtual, so it must be implemented */
virtual void Forward_cpu(const vector<Blob<Dtype>*>& bottom,
const vector<Blob<Dtype>*>& top) = 0;
/**
* @brief Using the GPU device, compute the layer output.
* Fall back to Forward_cpu() if unavailable.
*/
/* Forward pass on the GPU; falls back to the CPU forward pass if not overridden */
virtual void Forward_gpu(const vector<Blob<Dtype>*>& bottom,
const vector<Blob<Dtype>*>& top) {
// LOG(WARNING) << "Using CPU code as backup.";
return Forward_cpu(bottom, top);
}
/**
* @brief Using the CPU device, compute the gradients for any parameters and
* for the bottom blobs if propagate_down is true.
*/
/* Backward pass on the CPU; pure virtual, so it must be implemented */
virtual void Backward_cpu(const vector<Blob<Dtype>*>& top,
const vector<bool>& propagate_down,
const vector<Blob<Dtype>*>& bottom) = 0;
/**
* @brief Using the GPU device, compute the gradients for any parameters and
* for the bottom blobs if propagate_down is true.
* Fall back to Backward_cpu() if unavailable.
*/
/* Backward pass on the GPU; falls back to the CPU backward pass if not overridden */
virtual void Backward_gpu(const vector<Blob<Dtype>*>& top,
const vector<bool>& propagate_down,
const vector<Blob<Dtype>*>& bottom) {
// LOG(WARNING) << "Using CPU code as backup.";
Backward_cpu(top, propagate_down, bottom);
}
/**
* Called by the parent Layer's SetUp to check that the number of bottom
* and top Blobs provided as input match the expected numbers specified by
* the {ExactNum,Min,Max}{Bottom,Top}Blobs() functions.
*/
/* Checks that the numbers of bottom and top blobs meet the layer's requirements; called by SetUp */
virtual void CheckBlobCounts(const vector<Blob<Dtype>*>& bottom,
const vector<Blob<Dtype>*>& top) {
if (ExactNumBottomBlobs() >= 0) {// the number of bottom blobs must equal the exact count
CHECK_EQ(ExactNumBottomBlobs(), bottom.size())
<< type() << " Layer takes " << ExactNumBottomBlobs()
<< " bottom blob(s) as input.";
}
if (MinBottomBlobs() >= 0) {// the number of bottom blobs must be at least the minimum
CHECK_LE(MinBottomBlobs(), bottom.size())
<< type() << " Layer takes at least " << MinBottomBlobs()
<< " bottom blob(s) as input.";
}
if (MaxBottomBlobs() >= 0) {// the number of bottom blobs must be at most the maximum
CHECK_GE(MaxBottomBlobs(), bottom.size())
<< type() << " Layer takes at most " << MaxBottomBlobs()
<< " bottom blob(s) as input.";
}
if (ExactNumTopBlobs() >= 0) {// the number of top blobs must equal the exact count
CHECK_EQ(ExactNumTopBlobs(), top.size())
<< type() << " Layer produces " << ExactNumTopBlobs()
<< " top blob(s) as output.";
}
if (MinTopBlobs() >= 0) {// the number of top blobs must be at least the minimum
CHECK_LE(MinTopBlobs(), top.size())
<< type() << " Layer produces at least " << MinTopBlobs()
<< " top blob(s) as output.";
}
if (MaxTopBlobs() >= 0) {// the number of top blobs must be at most the maximum
CHECK_GE(MaxTopBlobs(), top.size())
<< type() << " Layer produces at most " << MaxTopBlobs()
<< " top blob(s) as output.";
}
if (EqualNumBottomTopBlobs()) {// if required, the bottom and top blob counts must match
CHECK_EQ(bottom.size(), top.size())
<< type() << " Layer produces one top blob as output for each "
<< "bottom blob input.";
}
}
/**
* Called by SetUp to initialize the weights associated with any top blobs in
* the loss function. Store non-zero loss weights in the diff blob.
*/
/* Sets the loss weight of each top blob from the loss_weight values in the layer parameters */
inline void SetLossWeights(const vector<Blob<Dtype>*>& top) {
const int num_loss_weights = layer_param_.loss_weight_size();
if (num_loss_weights) {
CHECK_EQ(top.size(), num_loss_weights) << "loss_weight must be "
"unspecified or specified once per top blob.";
for (int top_id = 0; top_id < top.size(); ++top_id) {// set the loss weight of each top blob
const Dtype loss_weight = layer_param_.loss_weight(top_id);
if (loss_weight == Dtype(0)) { continue; }
this->set_loss(top_id, loss_weight);
const int count = top[top_id]->count();
Dtype* loss_multiplier = top[top_id]->mutable_cpu_diff();
caffe_set(count, loss_weight, loss_multiplier);// fill the top blob's diff with its loss weight
}
}
}
private:
/** Whether this layer is actually shared by other nets*/
bool is_shared_;// whether the layer is actually shared by other nets
/** The mutex for sequential forward if this layer is shared */
shared_ptr<boost::mutex> forward_mutex_;// mutex that serializes the forward pass of a shared layer
/** Initialize forward_mutex_ */
void InitMutex();// allocates the mutex
/** Lock forward_mutex_ if this layer is shared */
void Lock();// locks the mutex if the layer is shared
/** Unlock forward_mutex_ if this layer is shared */
void Unlock();// unlocks the mutex if the layer is shared
DISABLE_COPY_AND_ASSIGN(Layer);
}; // class Layer
// Forward and backward wrappers. You should implement the cpu and
// gpu specific implementations instead, and should not change these
// functions.
/* Forward pass */
template <typename Dtype>
inline Dtype Layer<Dtype>::Forward(const vector<Blob<Dtype>*>& bottom,
const vector<Blob<Dtype>*>& top) {
// Lock during forward to ensure sequential forward
Lock();// hold the mutex during the forward pass (takes effect only if the layer is shared)
Dtype loss = 0;
Reshape(bottom, top);// adjust the top blobs to the current bottom shapes
switch (Caffe::mode()) {
case Caffe::CPU:
Forward_cpu(bottom, top);// CPU forward pass
for (int top_id = 0; top_id < top.size(); ++top_id) {
if (!this->loss(top_id)) { continue; }// only top blobs with a non-zero loss weight contribute to the loss
const int count = top[top_id]->count();
const Dtype* data = top[top_id]->cpu_data();
const Dtype* loss_weights = top[top_id]->cpu_diff();
loss += caffe_cpu_dot(count, data, loss_weights);// the loss is the dot product of the data and the loss weights stored in diff
}
break;
case Caffe::GPU:
Forward_gpu(bottom, top);// GPU forward pass
#ifndef CPU_ONLY// the loss is computed almost exactly as in the CPU branch
for (int top_id = 0; top_id < top.size(); ++top_id) {
if (!this->loss(top_id)) { continue; }
const int count = top[top_id]->count();
const Dtype* data = top[top_id]->gpu_data();
const Dtype* loss_weights = top[top_id]->gpu_diff();
Dtype blob_loss = 0;
caffe_gpu_dot(count, data, loss_weights, &blob_loss);
loss += blob_loss;
}
#endif
break;
default:
LOG(FATAL) << "Unknown caffe mode.";
}
Unlock();// release the mutex
return loss;// 0 if no top blob has a non-zero loss weight
}
/* Backward pass */
template <typename Dtype>
inline void Layer<Dtype>::Backward(const vector<Blob<Dtype>*>& top,
const vector<bool>& propagate_down,
const vector<Blob<Dtype>*>& bottom) {
switch (Caffe::mode()) {// choose the backward implementation according to the mode
case Caffe::CPU:
Backward_cpu(top, propagate_down, bottom);
break;
case Caffe::GPU:
Backward_gpu(top, propagate_down, bottom);
break;
default:
LOG(FATAL) << "Unknown caffe mode.";
}
}
// Serialize LayerParameter to protocol buffer
template <typename Dtype>// saves the layer parameters into a proto
void Layer<Dtype>::ToProto(LayerParameter* param, bool write_diff) {
param->Clear();
param->CopyFrom(layer_param_);
param->clear_blobs();
for (int i = 0; i < blobs_.size(); ++i) {
blobs_[i]->ToProto(param->add_blobs(), write_diff);
}
}
} // namespace caffe
#endif // CAFFE_LAYER_H_

Next, the code of layer.cpp:

#include <boost/thread.hpp>
#include "caffe/layer.hpp"
namespace caffe {
template <typename Dtype>
void Layer<Dtype>::InitMutex() {// allocate the mutex
forward_mutex_.reset(new boost::mutex());
}
template <typename Dtype>
void Layer<Dtype>::Lock() {
if (IsShared()) {
forward_mutex_->lock();// lock the mutex if the layer is shared
}
}
template <typename Dtype>
void Layer<Dtype>::Unlock() {
if (IsShared()) {
forward_mutex_->unlock();// unlock the mutex if the layer is shared
}
}
INSTANTIATE_CLASS(Layer);
} // namespace caffe

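At first glance the definition of Layer looks complicated, so let us go through its parts one by one.
First, the private section of Layer holds a few variables and functions: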

private:
bool is_shared_;
shared_ptr<boost::mutex> forward_mutex_;
void InitMutex();
void Lock();
void Unlock();

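The private section can be read as the bookkeeping for layer sharing. is_shared_ records whether the layer is actually shared by other nets, and forward_mutex_ is a shared pointer to the mutex that protects a shared layer. The three functions InitMutex(), Lock() and Unlock() allocate, lock and unlock that mutex around the forward pass. Note that only data layers can be shared with other nets, and only when training on multiple GPUs. These members are rarely touched when you write your own layer.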
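The conditional locking described above can be sketched without Caffe or Boost. This is a minimal illustration, assuming std::mutex in place of boost::mutex; the class MiniSharedLayer and its call counter are hypothetical names invented for the example and are not part of Caffe.

```cpp
#include <cassert>
#include <memory>
#include <mutex>

// Hypothetical stand-in for the sharing-related private members of Layer.
// std::mutex replaces boost::mutex; the call counter is only for illustration.
class MiniSharedLayer {
 public:
  MiniSharedLayer() : is_shared_(false), forward_calls_(0) {}

  // Mirrors InitMutex(): allocate the mutex once during setup.
  void InitMutex() { forward_mutex_.reset(new std::mutex()); }

  void SetShared(bool is_shared) { is_shared_ = is_shared; }
  bool IsShared() const { return is_shared_; }

  // Mirrors the Forward() wrapper: the work sits between Lock() and Unlock().
  int Forward() {
    Lock();
    const int calls = ++forward_calls_;  // the work that must not interleave
    Unlock();
    return calls;
  }

 private:
  // The mutex is touched only when the layer is shared.
  void Lock() {
    if (IsShared()) {
      assert(forward_mutex_ != nullptr);  // InitMutex() must have run first
      forward_mutex_->lock();
    }
  }
  void Unlock() {
    if (IsShared()) { forward_mutex_->unlock(); }
  }

  bool is_shared_;
  std::shared_ptr<std::mutex> forward_mutex_;
  int forward_calls_;
};
```

An unshared layer never touches the mutex, so single-net training pays no locking cost; only a layer marked as shared serializes its forward calls.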

Next, the protected section of Layer:

protected:
LayerParameter layer_param_;
Phase phase_;
vector<shared_ptr<Blob<Dtype> > > blobs_;
vector<bool> param_propagate_down_;
vector<Dtype> loss_;
virtual void Forward_cpu(const vector<Blob<Dtype>*>& bottom,
const vector<Blob<Dtype>*>& top) = 0;
virtual void Forward_gpu(const vector<Blob<Dtype>*>& bottom,
const vector<Blob<Dtype>*>& top) {
// LOG(WARNING) << "Using CPU code as backup.";
return Forward_cpu(bottom, top);
}
virtual void Backward_cpu(const vector<Blob<Dtype>*>& top,
const vector<bool>& propagate_down,
const vector<Blob<Dtype>*>& bottom) = 0;
virtual void Backward_gpu(const vector<Blob<Dtype>*>& top,
const vector<bool>& propagate_down,
const vector<Blob<Dtype>*>& bottom) {
// LOG(WARNING) << "Using CPU code as backup.";
Backward_cpu(top, propagate_down, bottom);
}
virtual void CheckBlobCounts(const vector<Blob<Dtype>*>& bottom,
const vector<Blob<Dtype>*>& top);
inline void SetLossWeights(const vector<Blob<Dtype>*>& top);

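The protected section contains what we work with when modifying a layer to add our own functionality. layer_param_ holds the layer's parameters (the LayerParameter message defined in caffe.proto); phase_ is either TRAIN or TEST and gives the layer's phase; blobs_ stores the layer's learnable parameters; param_propagate_down_ flags whether to compute the gradient of each parameter blob; loss_ stores the loss weight of each top blob; CheckBlobCounts verifies that the numbers of bottom and top blobs meet the layer's requirements; SetLossWeights sets the loss weight of each top blob.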
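The interplay between SetLossWeights and the loss loop in Forward is easy to miss in the listing: the weight is written into the top blob's diff buffer, and the loss is then the dot product of data and diff. Below is a minimal sketch of that idea, assuming a hypothetical MiniBlob struct with plain float vectors in place of Blob<Dtype>; the function names are illustrative only.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical miniature of a Blob: one buffer for values, one for gradients.
struct MiniBlob {
  std::vector<float> data;
  std::vector<float> diff;
  explicit MiniBlob(std::size_t n) : data(n, 0.0f), diff(n, 0.0f) {}
};

// In the spirit of SetLossWeights(): a non-zero weight fills the whole diff.
inline void SetLossWeight(MiniBlob* top, float loss_weight) {
  assert(top != nullptr);
  if (loss_weight == 0.0f) { return; }  // zero weights are skipped
  for (std::size_t i = 0; i < top->diff.size(); ++i) {
    top->diff[i] = loss_weight;
  }
}

// In the spirit of the loop in Forward(): loss = dot(data, diff).
inline float WeightedLoss(const MiniBlob& top) {
  float loss = 0.0f;
  for (std::size_t i = 0; i < top.data.size(); ++i) {
    loss += top.data[i] * top.diff[i];
  }
  return loss;
}
```

With data {1, 2, 3} and a loss weight of 0.5, the weighted loss is 0.5 + 1.0 + 1.5 = 3.0. A top blob whose weight was never set keeps a zero diff and contributes nothing.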
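The protected section also contains four important functions: Forward_cpu, Forward_gpu, Backward_cpu and Backward_gpu. These are the functions you implement when building a new layer. Forward_cpu and Backward_cpu are pure virtual and must be implemented, while the other two are optional. This is where virtual dispatch matters: in GPU mode, a layer that overrides the GPU functions runs them, and a layer that does not falls back to the CPU versions. The _cpu functions run the layer on the CPU (typically as serial code), and the _gpu functions run it on the GPU (as parallel computation).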
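The fallback behaviour can be reproduced in a few lines of standalone C++. This is a sketch under simplifying assumptions: the Mode enum stands in for Caffe::mode(), the device functions return a string instead of computing anything, and DeviceLayer, CpuOnlyLayer and DualLayer are made-up names.

```cpp
#include <cassert>
#include <string>

enum class Mode { CPU, GPU };  // hypothetical stand-in for Caffe::mode()

class DeviceLayer {
 public:
  virtual ~DeviceLayer() {}

  // Non-virtual wrapper, like Layer::Forward(): pick the device function.
  std::string Forward(Mode mode) {
    switch (mode) {
      case Mode::CPU: return Forward_cpu();
      case Mode::GPU: return Forward_gpu();
    }
    return "unknown";
  }

 protected:
  // Pure virtual: every derived layer must provide a CPU implementation.
  virtual std::string Forward_cpu() = 0;
  // Optional: the default simply reuses the CPU implementation.
  virtual std::string Forward_gpu() { return Forward_cpu(); }
};

// Implements only the CPU path, so GPU mode falls back to it.
class CpuOnlyLayer : public DeviceLayer {
 protected:
  std::string Forward_cpu() override { return "cpu"; }
};

// Overrides both paths, so GPU mode uses the GPU implementation.
class DualLayer : public DeviceLayer {
 protected:
  std::string Forward_cpu() override { return "cpu"; }
  std::string Forward_gpu() override { return "gpu"; }
};
```

Calling Forward(Mode::GPU) on a CpuOnlyLayer yields "cpu", while the same call on a DualLayer yields "gpu". The caller never needs to know which functions a layer overrides.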

Finally, the public section of Layer:

public:
explicit Layer(const LayerParameter& param);
virtual ~Layer() {}
void SetUp(const vector<Blob<Dtype>*>& bottom,
const vector<Blob<Dtype>*>& top);
virtual void LayerSetUp(const vector<Blob<Dtype>*>& bottom,
const vector<Blob<Dtype>*>& top) {}
virtual inline bool ShareInParallel() const;
inline bool IsShared() const;
inline void SetShared(bool is_shared);
virtual void Reshape(const vector<Blob<Dtype>*>& bottom,
const vector<Blob<Dtype>*>& top) = 0;
inline Dtype Forward(const vector<Blob<Dtype>*>& bottom,
const vector<Blob<Dtype>*>& top);
inline void Backward(const vector<Blob<Dtype>*>& top,
const vector<bool>& propagate_down,
const vector<Blob<Dtype>*>& bottom);
vector<shared_ptr<Blob<Dtype> > >& blobs();
const LayerParameter& layer_param() const;
virtual void ToProto(LayerParameter* param, bool write_diff = false);
inline Dtype loss(const int top_index) const;
inline void set_loss(const int top_index, const Dtype value);
virtual inline const char* type() const;
virtual inline int ExactNumBottomBlobs() const;
virtual inline int MinBottomBlobs() const;
virtual inline int MaxBottomBlobs() const;
virtual inline int ExactNumTopBlobs() const;
virtual inline int MinTopBlobs() const;
virtual inline int MaxTopBlobs() const;
virtual inline bool EqualNumBottomTopBlobs() const;
virtual inline bool AutoTopBlobs() const;
virtual inline bool AllowForceBackward(const int bottom_index) const;
inline bool param_propagate_down(const int param_id);
inline void set_param_propagate_down(const int param_id, const bool value);

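The public section provides the switches for querying and setting whether parameter gradients are computed, a set of interfaces for querying the required numbers of top and bottom blobs, and accessors for the loss of each top blob. The learnable parameters live in blobs_, and ToProto writes those blobs to a proto. It also defines the sharing-related functions ShareInParallel(), IsShared() and SetShared().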
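The blob-count queries all follow one convention: a return value of -1 means "no constraint", and any non-negative value is enforced by CheckBlobCounts. The sketch below isolates that convention. The CountRule struct and the CheckCount function are hypothetical, and the sketch returns an error message where the real code aborts through CHECK_EQ, CHECK_LE and CHECK_GE.

```cpp
#include <cassert>
#include <string>

// Hypothetical bundle of the three count queries. -1 means "no constraint",
// the same convention as ExactNumBottomBlobs(), MinBottomBlobs() and so on.
struct CountRule {
  int exact;
  int min;
  int max;
};

// Returns an empty string when `actual` satisfies the rule, else a message.
inline std::string CheckCount(const CountRule& rule, int actual) {
  if (rule.exact >= 0 && actual != rule.exact) {
    return "expects exactly " + std::to_string(rule.exact) + " blob(s)";
  }
  if (rule.min >= 0 && actual < rule.min) {
    return "expects at least " + std::to_string(rule.min) + " blob(s)";
  }
  if (rule.max >= 0 && actual > rule.max) {
    return "expects at most " + std::to_string(rule.max) + " blob(s)";
  }
  return "";
}
```

A rule of {-1, -1, -1} accepts any count, which matches the base class defaults. A derived layer tightens the rule by overriding only the query it cares about.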
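The public section also holds several important, frequently used functions. LayerSetUp is where a custom layer usually initializes its parameters. Reshape sets the shapes of the top blobs from the bottom blobs, fixing the data layout used during propagation. Forward runs the layer's forward pass and accumulates the loss over every top blob with a non-zero loss weight, which in practice means the loss layers at the end of the network. Backward runs the layer's backward pass.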
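Note that the constructor does not need to be modified or rewritten in a custom layer; Layer provides LayerSetUp and Reshape for user initialization. The constructor reads the layer's phase and copies any weights and biases given in the layer definition into blobs_. We usually do not specify weights and biases in the layer definition, so nothing is initialized there. The more important function is SetUp, which bundles the layer's initialization steps and is called when the Net is initialized; I will cover it in a later post.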
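The division of labour between the fixed SetUp sequence and the overridable hooks can be sketched as a small template-method example. This is a simplified illustration, not Caffe code: Buffer is a plain float vector standing in for Blob<Dtype>, MiniLayer and ScaleLayer are hypothetical names, the call log exists only to make the order visible, and the mutex, blob-count and loss-weight steps of the real SetUp are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

typedef std::vector<float> Buffer;  // hypothetical stand-in for Blob<Dtype>

class MiniLayer {
 public:
  virtual ~MiniLayer() {}

  // Fixed setup sequence in the spirit of Layer::SetUp(); not overridable.
  void SetUp(const Buffer& bottom, Buffer* top) {
    assert(top != nullptr);
    LayerSetUp(bottom);
    calls_.push_back("LayerSetUp");
    Reshape(bottom, top);
    calls_.push_back("Reshape");
  }

  // Optional one-time hook; the default does nothing.
  virtual void LayerSetUp(const Buffer& /*bottom*/) {}
  // Mandatory hook: size the output for the given input.
  virtual void Reshape(const Buffer& bottom, Buffer* top) = 0;

  // Like the real wrapper, Forward reshapes before computing.
  void Forward(const Buffer& bottom, Buffer* top) {
    Reshape(bottom, top);
    Forward_cpu(bottom, top);
  }

  const std::vector<std::string>& calls() const { return calls_; }

 protected:
  virtual void Forward_cpu(const Buffer& bottom, Buffer* top) = 0;

 private:
  std::vector<std::string> calls_;  // records the hook order for inspection
};

// A made-up custom layer that multiplies every input by a constant factor.
class ScaleLayer : public MiniLayer {
 public:
  void LayerSetUp(const Buffer& /*bottom*/) override { factor_ = 2.0f; }
  void Reshape(const Buffer& bottom, Buffer* top) override {
    top->resize(bottom.size());
  }

 protected:
  void Forward_cpu(const Buffer& bottom, Buffer* top) override {
    for (std::size_t i = 0; i < bottom.size(); ++i) {
      (*top)[i] = factor_ * bottom[i];
    }
  }

 private:
  float factor_ = 1.0f;  // set in LayerSetUp, not in a constructor
};
```

The derived class writes no constructor at all. Its configuration happens in LayerSetUp and its sizing in Reshape, which is the pattern the paragraph above recommends for custom layers.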
layer.cpp mainly defines the operations on the mutex used for the forward pass of a shared layer.

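That concludes the walk-through of layer.hpp and layer.cpp. Keep in mind that this is only an uncarved block of raw wood, so many layer-specific features are not visible yet. I plan to follow up with posts on several commonly used Caffe layers, so that the computation inside a deep network's layers can be understood from the code. However fancy a derived custom layer is, this block of wood stands quietly behind it, fixing the layer's structure and its computation flow. So when you build a layer, refer to this parent class rather than blindly copying other derived layers.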
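In short, layers are the bricks that build the big magic castle of a deep neural network. Layer is rigorously designed, supports inheritance so that users can add their own functionality, is simple to modify, and has well-encapsulated interfaces.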
You are welcome to read my upcoming posts on commonly used layers. Your support and encouragement are my greatest motivation!

written by jiong
To be a geek

Original article: https://www.cnblogs.com/cx2016/p/13778371.html
