Profile picture
Kishan Chakraborty
Research Scholar
Home
Blog
Back

Convexity

Convex functions are some of the most studied functions in mathematics and engineering. In this article we will explore a wide range of convex functions that are studied in mathematical optimization. Mathematically a function f:K→Rf : \mathcal{K} \rightarrow \mathbb{R}f:K→R is called convex if for all
α∈(0,1) and x1,x2∈K\alpha \in (0,1) \text{ and } x_1, x_2 \in \mathcal{K}α∈(0,1) and x1​,x2​∈K

f(αx1+(1−α)x2)≤αf(x1)+(1−α)f(x1)f(\alpha x_1 + (1-\alpha)x_2) \le \alpha f(x_1) + (1-\alpha) f(x_1)f(αx1​+(1−α)x2​)≤αf(x1​)+(1−α)f(x1​)

In common terms, a line segment joining any two points in the domain of fff always lie on or above the curve generated by fff. In other words, fff is atleast linear everwhere.

1. Properties of convex functions

  1. A convex function may not have a local minima e.g.,exe.g., e^xe.g.,ex. Even if a local minima exist, it may not be unique.
  2. If a convex function has more than one local minima say x1x_1x1​ and x2x_2x2​, all points in the line segment joining x1x_1x1​ and x2x_2x2​ are local minima.
  3. If a convex function has a unique local minima, it is a global minima.
  4. The average of the functional values of some randomly selected points sits on top of the function at the average of these points. This is called Jensen’s inequality
f(E[X])≤E[f(X)]f(\mathbb{E}[X]) \le \mathbb{E}[f(X)]f(E[X])≤E[f(X)]
  1. A differentiable function is convex iff the tangent at any point x∈Kx \in \mathcal{K}x∈K lies below the function everywhere else.
f(y)≥f(x)+∇f(x)T(y−x)f(y) \ge f(x) + \nabla f(x)^T (y-x)f(y)≥f(x)+∇f(x)T(y−x)
  1. Non-negative weighted sum of convex functions is always convex.
  2. Function composition of a convex function with a affine mapping results in a convex function.
  3. For convex functions {fi},i∈[n]\{f_i\}, i \in [n]{fi​},i∈[n], the point wise minima f(x)=min⁡i{fi(x)}f(x) = \min_{i}\{f_i(x)\}f(x)=mini​{fi​(x)} is convex.

Although a convex function has ‘nice’ properties, they are not sufficient to guarantee some result. For example, a iterative algorithm is guaranteed to reach a local minima but it may not be the most desired. In such cases, we need a stronger convexity called stong convex functions.

2. Strong Convex Functions

A strong convex function has a strictly positive lower bound on its curvature, i.e., it grows atleast as fast as a quadratic function everywhere. A α\alphaα-strongly convex function is defined as

f(y)≥f(x)+∇f(x)T(y−x)+α2∣∣y−x∣∣2f(y) \ge f(x) + \nabla f(x)^T(y-x) + \frac{\alpha}{2} ||y-x||^2f(y)≥f(x)+∇f(x)T(y−x)+2α​∣∣y−x∣∣2

This means for a α\alphaα-convex function f,f−α2x2f, f-\frac{\alpha}{2}x^2f,f−2α​x2 is still a convex function. Some convex functions like linear functions, ex,∣x∣,x4e^x, |x|, x^4ex,∣x∣,x4 and higher order even polynomial function are not strongly convex. Exponential function exe^xex becomes flat as x→−∞,x4x \rightarrow -\infty, x^4x→−∞,x4 is flat around 000 therefore, they are not strongly convex. Surprisingly, there in an infinite numbers of family of strongly convex functions.

2.1. Properties of strongly convex functions

  1. A strongly convex function has a unique global minima.
  2. Unlike a convex function that only stays above the tangent line, a strongly convex function also stays above a parabola.
  3. When ∣∣x∣∣→∞||x|| \rightarrow \infty∣∣x∣∣→∞, a strongly convex function f→∞f \rightarrow \inftyf→∞.