Background & Introduction
Implementation
Software Organization
How to Use
Additional Features/Extension
Impact and Inclusivity
Future Work & Other Possible Extensions
Original Function
$ f(x,y) = 3xy$ at $(x,y) = (2,3)$
$ f(2,3) = 3 \times 2 \times 3 = 18$
Solving by Taking Partial Derivatives
$ \frac{\partial f}{\partial x} (x,y) = 3y $
$ \frac{\partial f}{\partial y} (x,y) = 3x $
$ \frac{\partial f}{\partial x} (2,3) = 9 $
$ \frac{\partial f}{\partial y} (2,3) = 6 $
The current value is the same as the original function:
$f(2,3) = 3 \times 2 \times 3 = 18 = v_4 $
Same answers as the partial derivatives:
$\Delta_x = 9 = \frac{\partial f}{\partial x} (2,3)$
$\Delta_y = 6 = \frac{\partial f}{\partial y} (2,3)$
$ v_1 = x $
$ v_2 = y $
$ v_3 = 3v_1 $
$ v_4 = v_2 \times v_3 = f(x,y) $
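As a preview of the package introduced below, this example can be reproduced in a few lines. This is a minimal sketch, assuming the AADVariable seeding interface demonstrated later in this documentation:

import AAD as AD

# Seed x and y with unit derivative vectors (forward mode)
x = AD.AADVariable(2.0, [1, 0])   # x = 2
y = AD.AADVariable(3.0, [0, 1])   # y = 3

f = 3*x*y
print(f)   # expect: AADVariable fun = 18.0, der = [9. 6.]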
README.md
demos/                    Final presentation with appropriate helper files
    Presentation.ipynb    Final presentation
    ...
docs/                     Contains the documentation for the project
    README.md
    documentation.ipynb   The main documentation
    milestone1.ipynb
    milestone2.ipynb
    milestone2_progress.ipynb
    ...
code/                     Source files
    AADUtils.py           Utilities used as support functions (internal)
    AAD.py                Main constructor and dual-number forward-mode implementation
    AADFunction.py        Vector-valued function wrapper
    ...
    solvers/
        GradientDescent.py    Gradient-descent optimizer
        Newton.py             Newton root finder (vector-valued, linear and nonlinear functions)
        GMRES.py              Generalized Minimal Residual method root finder (vector-valued, linear functions)
tests/                    Contains the test suite for the project
    run_tests.py          Runs all of the tests in the folder
    test_funtion.py       Tests vector-valued functions of AD variables
    test_gmres.py         Tests for GMRES
    test_graddesc.py      Tests for gradient descent
    test_main.py          Contains operator overrides and tests of complex functions of one variable
    test_newton.py        Tests for Newton's method
    test_utils.py         Tests internal utilities
The following elementary functions are implemented:

- `exp(x)`: e raised to the power of x
- `log(x)`: natural log of x
- `sin(x)`: sine of x
- `sinh(x)`: hyperbolic sine of x
- `cos(x)`: cosine of x
- `cosh(x)`: hyperbolic cosine of x
- `tan(x)`: tangent of x
- `tanh(x)`: hyperbolic tangent of x
- `arcsin(x)`: inverse sine of x
- `arccos(x)`: inverse cosine of x
- `arctan(x)`: inverse tangent of x
- `sqrt(x)`: square root of x

The following operators are overridden: `__rmul__`, `__mul__`, `__neg__`, `__add__`, `__radd__`, `__sub__`, `__rsub__`, `__truediv__`, `__rtruediv__`, `__pow__`, `__rpow__`, `__repr__`.
These overrides accept real numbers but do not assume them; they can also be used with multiple AADVariable objects.
These overridden operators allow for seamless operations, and combinations of operations, including:
- `2**x` or `x**2`
- `x*2` or `2*x` or `sin(x)*cos(x)`
- `x-3` or `3-x` or `sin(x)-cos(x)`
- `x/4` or `4/x` or `sin(x)/cos(x)`
- `x+5` or `5+x` or `sin(x)+cos(x)`
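For example, here is a quick sketch of these overloads in action (assuming the imports described in the How to Use section):

import math
import AAD as AD

x = AD.AADVariable(math.pi / 4)   # scalar variable; derivative seed defaults to 1

print(2**x)                   # __rpow__: constant base, AADVariable exponent
print(x*2 + 5 - x/4)          # mixed arithmetic via the overrides
print(AD.sin(x)*AD.cos(x))    # product rule: derivative cos^2(x) - sin^2(x) = 0 at pi/4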
To get started:

1. `git clone` this package repository into your project directory.
2. Install the dependencies with `pip`: `pip install -r requirements.txt`
3. Import what you need:
   - `import AAD as AD` for AAD objects
   - `import AADFunction as ADF` for vector-valued functions
   - `from solvers.Newton import AAD_Newton` for Newton's method
   - `from solvers.GradientDescent import AAD_grad` for gradient descent
   - `from solvers.GMRES import AAD_GMRES` for GMRES

As a quick reference, you can install AAD with conda in a new environment using the following commands:
git clone git@github.com:dist-computing/cs107-FinalProject.git AAD
cd AAD
conda create --name AAD python=3.8
conda install -n AAD --file requirements.txt
To activate the environment later, simply use `source activate AAD`.
To install using PyPI via `pip`: `AAD` is the package namespace, and thus the import is slightly different:
$ python -m pip install --upgrade aad-jimmielin
$ python
>>> from AAD.AADVariable import AADVariable
>>> x = AADVariable(1.0, [1,0])
>>> y = AADVariable(2.0, [0, 1])
>>> x*y
AADVariable fun = 2.0, der = [2. 1.]
>>> from AAD.AADFunction import AADFunction
>>> f = AADFunction([x+1, y*x])
>>> print(f)
[AADFunction fun = [2.0, 2.0], der = [[1. 0.]
[2. 1.]]]
To publish the GitHub Pages website efficiently, we further streamlined the work with bash scripting. Using the script included in the package, you can keep two folders tracking two branches: one for the main repository and one for the gh-pages branch. Running the bash script below updates the website from the documentation file: ./UpdateIndexFromDocomentation.sh
#!/bin/bash
cd ../website/
#copy files from main repo
cp ../cs107-FinalProject/docs/documentation.ipynb ./documentation.ipynb
#convert notebook to markdown
jupyter nbconvert $(pwd)/documentation.ipynb --to markdown --output $(pwd)/index.md
#add title to file
sed -i '1s/^/---\n/' index.md
sed -i '1s/^/ title: About the code \n/' index.md
sed -i '1s/^/---\n/' index.md
#commit, push and publish
git add index.md documentation.ipynb
git commit -m 'updated index.md with documentation <automated>'
git push
import sys
sys.path.insert(1, '../AAD/')
#importing Dependencies
import numpy as np
import AAD as AD
import AADFunction as ADF
import math
Function & Evaluation
$ f(x) = \ln(x) $ at x = 4
$ f(4) = \ln(4) = 1.3862943611198906 $
Derivative & Evaluation
$ f'(x) = \frac{1}{x} $
$ f'(4) = \frac{1}{4} = 0.25 $
x = 4 #initial value of x
my_AD = AD.AADVariable(x) #creating the AAD object
my_AD = AD.log(my_AD) #applying a function to the object to find both the value and derivative
#Prints value of log and derivative at x = 4
print(my_AD)
AADVariable fun = 1.3862943611198906, der = 0.25
Function & Evaluation
$ f(x) = e^{\tan(x)}$ at $x = \pi$
$ f(\pi) = e^{\tan(\pi)} = 1 $
Derivative & Evaluation
$ f'(x) = \frac{1}{\cos^2(x)} \times e^{\tan(x)} = \frac{e^{\tan(x)}}{\cos^2(x)}$
$ f'(\pi) = \frac{e^{\tan(\pi)}}{\cos^2(\pi)} = 1$
x = math.pi #initial value of x
my_AD = AD.AADVariable(x) #creating the AAD object
my_AD = AD.exp(AD.tan(my_AD)) #applying a function to the object to find both the value and derivative
#Prints value of e^tan(x) and derivative at x = pi
print(my_AD)
AADVariable fun = 0.9999999999999999, der = 0.9999999999999999
Function & Evaluation
$ f(x,y) = x + y - 6 $ at x = 1.0 and y = 2.0
$ f(1.0,2.0) = 1.0 + 2.0 - 6 = -3.0 $
Partial Derivative with respect to X
$ \frac{\partial f}{\partial x} (x,y) = 1.0 $
$ \frac{\partial f}{\partial x} (1.0,2.0) = 1.0 $
Partial Derivative with respect to Y
$ \frac{\partial f}{\partial y} (x,y) = 1.0 $
$ \frac{\partial f}{\partial y} (1.0,2.0) = 1.0 $
Derivative Evaluation
$ f'(1.0,2.0) = \begin{bmatrix} 1.0 & 1.0 \end{bmatrix}$
### Multi Dimensional Example - simple case
x = AD.AADVariable(1.0, [1, 0]) # x = 1.0
y = AD.AADVariable(2.0, [0, 1]) # y = 2.0
scalar = 6.0
f = x + y - scalar
print(f)
AADVariable fun = -3.0, der = [1 1]
Function & Evaluation
$ f(x,y) = \frac{x+ \frac{y}{2}}{10} $ at x = 1.0 and y = 2.0
$ f(1.0,2.0) = \frac{1.0+ \frac{2.0}{2}}{10} = 0.2$
Partial Derivative with respect to X
$ \frac{\partial f}{\partial x} (x,y) = \frac{1}{10} $
$ \frac{\partial f}{\partial x} (1.0,2.0) = 0.1 $
Partial Derivative with respect to Y
$ \frac{\partial f}{\partial y} (x,y) = \frac{0.5}{10} $
$ \frac{\partial f}{\partial y} (1.0,2.0) = 0.05 $
Derivative Evaluation
$ f'(1.0,2.0) = \begin{bmatrix} 0.1 & 0.05 \end{bmatrix}$
# A more advanced multi-dimensional expression using the previously declared AAD variables
print((x + 0.5*y)/10) # 0.1x+0.05y
AADVariable fun = 0.2, der = [0.1 0.05]
Function & Evaluation
$ f(x,y,z) = x \times y \times z$ at x = 2.0, y = 1.0 and z = 4.0
$ f(2.0,1.0,4.0) = 2.0 \times 1.0 \times 4.0 = 8.0$
Partial Derivative with respect to X
$ \frac{\partial f}{\partial x} (x,y,z) = y \times z$
$ \frac{\partial f}{\partial x} (2.0,1.0,4.0) = 1.0 \times 4.0 = 4.0$
Partial Derivative with respect to Y
$ \frac{\partial f}{\partial y} (x,y,z) = x \times z $
$ \frac{\partial f}{\partial y} (2.0,1.0,4.0) = 2.0 \times 4.0 = 8.0$
Partial Derivative with respect to Z
$ \frac{\partial f}{\partial z} (x,y,z) = x \times y $
$ \frac{\partial f}{\partial z} (2.0,1.0,4.0) = 2.0 \times 1.0 = 2.0$
Derivative Evaluation
$ f'(2.0,1.0,4.0) = \begin{bmatrix} 4.0 & 8.0 & 2.0 \end{bmatrix}$
The AADVariable parser automatically zero-pads derivative arrays as required, so both of the following syntaxes work:
### Multi Dimensional Example - complex case
x = AD.AADVariable(2.0, [1, 0, 0]) # x = 2.0
y = AD.AADVariable(1.0, [0, 1, 0]) # y = 1.0
z = AD.AADVariable(4.0, [0, 0, 1]) # z = 4.0
f = x*y*z
print(x*y*z)
x = AD.AADVariable(2.0, [1]) # x = 2.0
y = AD.AADVariable(1.0, [0, 1]) # y = 1.0
z = AD.AADVariable(4.0, [0, 0, 1]) # z = 4.0
print('We get the same result')
f = x*y*z
print(x*y*z)
AADVariable fun = 8.0, der = [4. 8. 2.]
We get the same result
AADVariable fun = 8.0, der = [4. 8. 2.]
Function & Evaluation
$f(x,y,z) = \begin{bmatrix} x+y \times z\\ z \times x-y \end{bmatrix}$ at x = 3.0, y = 2.0, and z = 1.0
$f(3.0,2.0,1.0) = \begin{bmatrix} 3.0+2.0 \times 1.0\\ 1.0 \times 3.0-2.0 \end{bmatrix} = \begin{bmatrix} 5.0\\ 1.0 \end{bmatrix} $
Partial Derivative with respect to X
$\frac{\partial f}{\partial x} (x,y,z) = \begin{bmatrix} 1.0\\ z \end{bmatrix}$
$\frac{\partial f}{\partial x} (3.0,2.0,1.0) = \begin{bmatrix} 1.0\\ 1.0 \end{bmatrix}$
Partial Derivative with respect to Y
$\frac{\partial f}{\partial y} (x,y,z) = \begin{bmatrix} z\\ -1.0 \end{bmatrix}$
$\frac{\partial f}{\partial y} (3.0,2.0,1.0) = \begin{bmatrix} 1.0\\ -1.0 \end{bmatrix}$
Partial Derivative with respect to Z
$\frac{\partial f}{\partial z} (x,y,z) = \begin{bmatrix} y\\ x \end{bmatrix}$
$\frac{\partial f}{\partial z} (3.0,2.0,1.0) = \begin{bmatrix} 2.0\\ 3.0 \end{bmatrix}$
Derivative Evaluation
$ f'(3.0,2.0,1.0) = \begin{bmatrix} 1.0 & 1.0 & 2.0\\ 1.0 & -1.0 & 3.0 \end{bmatrix}$
x = AD.AADVariable(3.0, [1]) # not specifying der, or [1], [1, 0], [1, 0, 0] will all work, as above
y = AD.AADVariable(2.0, [0, 1])
z = AD.AADVariable(1.0, [0, 0, 1])
f_1 = x+(y*z)
f_2 = (z*x)-y
f = ADF.AADFunction([f_1, f_2])
print(f)
[AADFunction fun = [5.0, 1.0], der = [[ 1.  1.  2.]
 [ 1. -1.  3.]]]
GMRES uses AAD's automatic differentiation core to solve systems of linear equations.
Given a system of $k$ linear equations for the function $F: \mathbb{R}^k \rightarrow \mathbb{R}^k$ with a nonsingular Jacobian matrix $J_F(x_k)$, the action of the Jacobian on a vector is all that GMRES needs to solve $J_F(x_k)\Delta x_k = -F(x_k)$, so the Jacobian itself never has to be computed.
Using this principle we can iteratively solve for $\Delta x_k$, improving the guess until convergence.
#GMRES
x = AD.AADVariable(3, [1, 0])
initial_guess = [30]
solve = AAD_GMRES.solve(lambda x: [3*x-2], initial_guess)
print(solve[0])
0.6666666666666679
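To illustrate the matrix-free principle outside the package, here is a minimal sketch using SciPy's GMRES (SciPy is assumed available here for illustration only; AAD_GMRES above is the package's own solver). Only the action of the Jacobian on a vector is supplied via a LinearOperator, and the update $\Delta x$ is recovered without ever forming $J$:

import numpy as np
from scipy.sparse.linalg import LinearOperator, gmres

# Linear system F(x) = A x - b, where only the action of A (the
# Jacobian of F) on a vector is available, never the matrix itself.
def jacobian_action(v):
    return np.array([3.0*v[0] + v[1], v[0] + 2.0*v[1]])

b = np.array([5.0, 4.0])
x_k = np.zeros(2)                        # current guess
F_xk = jacobian_action(x_k) - b          # residual F(x_k)
J = LinearOperator((2, 2), matvec=jacobian_action)
dx, info = gmres(J, -F_xk)               # solve J dx = -F(x_k)
print(x_k + dx)                          # -> [1.2 1.4], the exact root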
Newton's method uses AAD's automatic differentiation core to solve systems of linear and non-linear equations. Inverting the Jacobian is expensive and unstable, so solving the linear system $J_F(x_n)\Delta x_n = -F(x_n)$ for the update $\Delta x_n$ is more practical.
We implement this using our automatic differentiation core, which retrieves the Jacobian $J_F(x_n)$ accurate to machine precision.
In the future, we can also extend the Newton's method solver to "least squares sense" solution problems by using a generalized inverse of a non-square Jacobian matrix $J^\dagger = (J^TJ)^{-1}J^T$ for non-linear functions, thus expanding our optimization toolbox.
#Newton's Method
x = AD.AADVariable(3, [1, 0])
initial_guess = [30]
solve = AAD_Newton.solve(lambda x: [3*x-2], initial_guess)
print(solve)
[0.6666666666666666]
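To make the iteration concrete, here is a minimal hand-rolled sketch of the Newton step with a hand-coded Jacobian (illustration only; the package instead retrieves $J_F$ automatically from forward-mode AD):

import numpy as np

# Nonlinear system: F(x) = [x0^2 + x1 - 3, x0 + x1^2 - 5]; root at (1, 2).
def F(x):
    return np.array([x[0]**2 + x[1] - 3.0, x[0] + x[1]**2 - 5.0])

def J(x):                                # hand-coded Jacobian of F
    return np.array([[2.0*x[0], 1.0],
                     [1.0, 2.0*x[1]]])

x = np.array([1.0, 1.0])                 # initial guess
for _ in range(20):
    dx = np.linalg.solve(J(x), -F(x))    # solve J(x) dx = -F(x)
    x = x + dx
    if np.linalg.norm(dx) < 1e-12:
        break
print(x)                                 # -> [1. 2.]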
Gradient descent iterates $a_{n+1} = a_n - \gamma\nabla F(a_n)$, where $a_{n+1}$ is the new point obtained from the previous point $a_n$ by stepping along the negative gradient of the function evaluated at $a_n$; $\gamma$ sets the step size.
#Gradient Descent
x = AD.AADVariable(3, [1, 0])
gamma = .0001
solve = AAD_grad.solve(lambda x: AD.abs(3*x[0]-2), [x], gamma)
print(solve[0].val)
0.6665999999995169
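The update rule itself is simple. Below is a minimal hand-rolled sketch for the smooth variant $F(x) = (3x-2)^2$ of the example above, with an analytic gradient (illustration only; the package's AAD_grad computes the gradient via AD):

# F(x) = (3x - 2)^2, gradient F'(x) = 6 * (3x - 2)
gamma = 0.01                 # step size
a = 3.0                      # starting point
for _ in range(2000):
    a = a - gamma * 6.0 * (3.0*a - 2.0)   # a_{n+1} = a_n - gamma * F'(a_n)
print(a)                     # -> approximately 2/3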
The implementation of backpropagation in neural networks is essentially a special case of reverse-mode AD with an added set of elementary functions. These elementary functions are called activation functions, some of which are already implemented in this version of the software.
Here is a comprehensive list of new and already-implemented functions we plan on incorporating into a future version of this software; a sketch of one such function follows the list.
New functions: `relu`, `leaky relu`, `sigmoid`, `swish`, `softmax`, `binary step`
Already implemented: `linear`, `tanh`
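As a taste of how such an activation could plug into the existing forward-mode core, here is a hypothetical sketch of relu. It assumes AADVariable exposes mutable val/der attributes (as suggested by solve[0].val above); this function is not yet part of the package:

import AAD as AD

def relu(x):
    # Hypothetical elementary function: relu(x) = max(0, x).
    # Forward mode: the derivative passes through where x.val > 0;
    # the subgradient at 0 is taken to be 0.
    out = AD.AADVariable(max(0.0, x.val))
    out.der = x.der * (1.0 if x.val > 0 else 0.0)
    return out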
Background & Introduction
Implementation/Software Organization/How to Use
Additional Features/Extension
Impact and Inclusivity
Future Work & Other Possible Extensions
benjaminmanning@hks.harvard.edu
fernandes@g.harvard.edu
hplin@g.harvard.edu