Multi-layer perceptrons approach the multivariate logistic regression problem by learning the appropriate features at the same time as their coefficients.
The cost function for stochastic gradient descent (SGD) considers the cost of a single example, randomly-chosen. Mini-batch gradient descent considers a fraction of the examples.
TensorFlow is a fast, efficient open source library that can automatically generate partial derivative functions from the definition of a complex cost function.