Simplistically, the GPU feeds a stream of vertices into a vertex function. The vertex function outputs a position for each vertex. That’s the only really important thing about the vertex function.

The rasteriser takes those positions and fills in triangles in 2D. If you think of that 2D space as a grid, then conceptually each triangle covers some of the squares (fragments) in that grid.

The fragment function takes in each fragment and assigns it a color.

So the vertex function is for position, and the fragment function is for color. Anything else is extra.
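To make the three stages concrete, here is a rough CPU-side sketch in plain C++. It is not Metal code and not how the GPU actually does it. The names `vertexFunction`, `fragmentFunction` and `render` are made up for illustration, the "color" is just a character, and the rasteriser is a simple "is the centre of this grid square inside the triangle" test. It also assumes a single triangle and a 2D position, whereas a real Metal vertex function returns a 4-component position.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

struct Vec2 { float x, y; };

// Vertex function (hypothetical): one vertex in, one position out.
// Here it only scales model coordinates up to grid coordinates.
Vec2 vertexFunction(Vec2 vertex, float scale) {
    return {vertex.x * scale, vertex.y * scale};
}

// Fragment function (hypothetical): one fragment in, one color out.
// The "color" is a character so the result can be printed as text.
char fragmentFunction(int x, int y) {
    return ((x + y) % 2 == 0) ? '#' : '+';
}

// Edge function: the sign says which side of the line a->b the point p is on.
float edge(Vec2 a, Vec2 b, Vec2 p) {
    return (b.x - a.x) * (p.y - a.y) - (b.y - a.y) * (p.x - a.x);
}

// Stand-in for the fixed-function rasteriser: runs the vertex function on
// each vertex, finds the grid squares whose centre lies inside the triangle,
// and calls the fragment function once for each of those squares.
std::vector<std::string> render(const std::array<Vec2, 3>& vertices,
                                float scale, int width, int height) {
    assert(width > 0 && height > 0);

    std::array<Vec2, 3> p;
    for (std::size_t i = 0; i < 3; ++i) {
        p[i] = vertexFunction(vertices[i], scale);
    }

    // '.' marks grid squares that no fragment covers.
    std::vector<std::string> grid(height, std::string(width, '.'));
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            Vec2 centre{x + 0.5f, y + 0.5f};
            float e0 = edge(p[0], p[1], centre);
            float e1 = edge(p[1], p[2], centre);
            float e2 = edge(p[2], p[0], centre);
            // Inside if all three edge tests agree in sign (either winding).
            bool inside = (e0 >= 0 && e1 >= 0 && e2 >= 0) ||
                          (e0 <= 0 && e1 <= 0 && e2 <= 0);
            if (inside) {
                grid[y][x] = fragmentFunction(x, y);
            }
        }
    }
    return grid;
}
```

Calling `render` with the triangle (0,0), (1,0), (0,1), a scale of 4 and a 4 by 4 grid colors the square at row 0, column 0 and leaves the square at row 3, column 3 as `.`, because its centre falls outside the triangle. The only place a position is decided is `vertexFunction`, and the only place a color is decided is `fragmentFunction`.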

Source: Face normal vs vertex normal – Books / Metal by Tutorials – raywenderlich.com Forums