<P> The convolutional layer is the core building block of a CNN. The layer's parameters consist of a set of learnable filters (or kernels), which have a small receptive field but extend through the full depth of the input volume. During the forward pass, each filter is convolved across the width and height of the input volume, computing the dot product between the entries of the filter and the input and producing a 2-dimensional activation map of that filter. As a result, the network learns filters that activate when they detect some specific type of feature at some spatial position in the input. </P>
<P> Stacking the activation maps for all filters along the depth dimension forms the full output volume of the convolution layer. Every entry in the output volume can thus also be interpreted as the output of a neuron that looks at a small region of the input and shares parameters with neurons in the same activation map. </P>
<P> When dealing with high-dimensional inputs such as images, it is impractical to connect each neuron to all neurons in the previous volume, because such a network architecture does not take the spatial structure of the data into account. Convolutional networks exploit spatially local correlation by enforcing a local connectivity pattern between neurons of adjacent layers: each neuron is connected to only a small region of the input volume. The extent of this connectivity is a hyperparameter called the receptive field of the neuron. The connections are local in space (along width and height) but always extend along the entire depth of the input volume. Such an architecture ensures that the learnt filters produce the strongest response to a spatially local input pattern. </P>
<P> Three hyperparameters control the size of the output volume of the convolutional layer: the depth, the stride, and the amount of zero-padding. </P>
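The forward pass described above can be sketched in a few lines of NumPy: one filter with a small receptive field (but full input depth) slides across width and height, computing a dot product at each position to produce a 2-D activation map; stacking the maps of all filters along depth forms the output volume. This is a minimal illustrative sketch, not an efficient implementation, and the function name and shapes are my own choices; the spatial output size follows the conventional relation (W − F + 2P)/S + 1, which the text implies via depth, stride, and zero-padding but does not state explicitly.

```python
import numpy as np

def conv2d_single_filter(x, w, b=0.0, stride=1, pad=0):
    """Slide one filter across the width and height of the input volume.

    x: input volume of shape (H, W, D); w: filter of shape (F, F, D),
    i.e. a small receptive field extending through the full input depth.
    Returns the 2-D activation map of this filter.
    """
    if pad:
        x = np.pad(x, ((pad, pad), (pad, pad), (0, 0)))  # zero-padding
    H, W, _ = x.shape
    F = w.shape[0]
    # Spatial output size: (W - F + 2P) / S + 1 (padding already applied above)
    out_h = (H - F) // stride + 1
    out_w = (W - F) // stride + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            patch = x[i*stride:i*stride+F, j*stride:j*stride+F, :]
            out[i, j] = np.sum(patch * w) + b  # dot product over the receptive field
    return out

x = np.random.rand(32, 32, 3)                           # e.g. a 32x32 RGB image
filters = [np.random.rand(5, 5, 3) for _ in range(10)]  # layer depth = 10 filters

# One filter yields a 2-D activation map; stacking all maps along the
# depth dimension forms the full output volume of the layer.
amap = conv2d_single_filter(x, filters[0])
volume = np.stack([conv2d_single_filter(x, f) for f in filters], axis=-1)
print(amap.shape)    # (28, 28): (32 - 5)/1 + 1 = 28
print(volume.shape)  # (28, 28, 10)

# Zero-padding of 2 preserves the 32x32 spatial size for a 5x5 filter:
print(conv2d_single_filter(x, filters[0], pad=2).shape)  # (32, 32)
```

Note how every output entry depends only on a small spatial patch of the input while the same filter weights are reused at every position, which is exactly the local connectivity and parameter sharing described above.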

<P> In a typical CNN, the first layer applied to the raw input (e.g. the image pixels) is a convolutional layer. </P>